Title Of Invention

Cloned Genome Of Infectious Hepatitis C Virus Of
Genotype 2a And Uses Thereof

Field Of Invention

The present invention relates to molecular
approaches to the production of nucleic acid sequence
which comprises the genome of infectious hepatitis C
virus. In particular, the invention provides a nucleic
acid sequence which comprises the genome of an
infectious hepatitis C virus of genotype 2a. The
invention therefore relates to the use of the nucleic
acid sequence and polypeptides encoded by all or part of
the sequence in the development of vaccines and
diagnostic assays for HCV and in the development of
screening assays for the identification of antiviral
agents for HCV.

Background Of Invention

Hepatitis C virus (HCV) has a positive-sense
single-strand RNA genome and is a member of the genus
Hepacivirus within the Flaviviridae family of viruses
(Rice, 1996). As for all positive-stranded RNA viruses,
the genome of HCV functions as mRNA from which all viral
proteins necessary for propagation are translated.

The viral genome of HCV is approximately 9600
nucleotides (nts) in length and consists of a highly
conserved 5' untranslated region (UTR), a single long
open reading frame (ORF) of approximately 9,000 nts and
a complex 3' UTR. The 5' UTR contains an internal
ribosomal entry site (Tsukiyama-Kohara et al., 1992;
Honda et al., 1996). The 3' UTR consists of a short
variable region, a polypyrimidine tract of variable
length and, at the 3' end, a highly conserved region of
approximately 100 nucleotides (Kolykhalov et al., 1996;
Tanaka et al., 1995; Tanaka et al., 1996; Yamada et al.,
1996). The last 46 nucleotides of this conserved region
were predicted to form a stable stem-loop structure
thought to be critical for viral replication (Blight and
Rice, 1997; Ito and Lai, 1997; Tsuchihara et al., 1997).
The ORF encodes a large polypeptide precursor that is
cleaved into at least 10 proteins by host and viral
proteinases (Rice, 1996). The predicted envelope
proteins contain several conserved N-linked 
glycosylation sites and cysteine residues (Okamoto et 
al., 1992a). The NS3 gene encodes a serine protease and 
an RNA helicase and the NS5B gene encodes an
RNA-dependent RNA polymerase.

A remarkable characteristic of HCV is its
genetic heterogeneity, which is manifested throughout
the genome (Bukh et al., 1995). The most heterogeneous
regions of the genome are found in the envelope genes,
in particular the hypervariable region 1 (HVR1) at the
N-terminus of E2 (Hijikata et al., 1991; Weiner et al.,
1991). HCV circulates as a quasispecies of closely
related genomes in an infected individual. Globally,
six major HCV genotypes (genotypes 1-6) and multiple
subtypes (a, b, c, etc.) have been identified (Bukh et
al., 1993; Simmonds et al., 1993).

The nucleotide and deduced amino acid 
sequences among isolates within a quasispecies generally 
differ by < 2%, whereas those between isolates of 
different genotypes vary by as much as 35%. Genotypes
1, 2 and 3 are found worldwide and constitute more than
90% of the HCV infections in North and South America, 
Europe, Russia, China, Japan and Australia (Forns and 
Bukh, 1998). Throughout these regions genotype 1 
accounts for the majority of HCV infections but 
genotypes 2 and 3 each account for 5-15%. 
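By way of illustration only, the percentage differences quoted above can be understood as the proportion of aligned positions at which two sequences differ. The following is a minimal sketch in Python, assuming the two sequences have already been aligned to equal length; the function name `percent_difference` and the short test sequences are hypothetical and are not taken from any HCV isolate.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that differ between two sequences.

    Columns in which either sequence carries a gap ('-') are skipped,
    so only positions present in both sequences are compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore insertion/deletion columns
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Hypothetical aligned fragments (toy data, not real HCV sequence).
reference = "ATGAGCACGAATCCTAAACC"
variant = "ATGAGCACAAATCCTAAACC"
print(percent_difference(reference, variant))
```

The same function applies to amino acid alignments, since it only compares characters column by column.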

At present, more than 80% of individuals
infected with HCV become chronically infected and these
chronically infected individuals have a relatively high
risk of developing chronic hepatitis, liver cirrhosis
and hepatocellular carcinoma (Hoofnagle, 1997). The
only effective therapy for chronic hepatitis C,
interferon (IFN), alone or in combination with
ribavirin, induces a sustained response in less than 50%
of treated patients (Davis et al., 1998; McHutchison et
al., 1998). Consequently, HCV is currently the most
common cause of end stage liver failure and the reason
for about 30% of liver transplants performed in the U.S.
(Hoofnagle, 1997). In addition, a number of recent
studies suggested that the severity of liver disease and
the outcome of therapy may be genotype-dependent
(reviewed in Bukh et al., 1997). In particular, these
studies suggested that infection with HCV genotype 1b
was associated with more severe liver disease (Brechot,
1997) and a poorer response to IFN therapy (Fried and
Hoofnagle, 1995). As a result of the inability to
develop a universally effective therapy against HCV
infection, it is estimated that there are still more
than 25,000 new infections yearly in the U.S. (Alter,
1997). Moreover, since there is no vaccine for HCV, HCV
remains a serious public health problem.

Despite the intense interest in the
development of vaccines and therapies for HCV, progress 
has been hindered by the absence of a useful cell 
culture system and the lack of any small animal model 
for laboratory study. For example, while replication of 
HCV in several cell lines has been reported, such 
observations have turned out not to be highly 
reproducible. In addition, the chimpanzee is the only 
animal model, other than man, for this disease. 
Consequently, HCV has been studied only by using 
clinical materials obtained from patients or 
experimentally infected chimpanzees, an animal model 
whose availability is very limited. 

However, several researchers have recently 
reported the construction of infectious cDNA clones of 
HCV, the identification of which would permit a more 
effective search for susceptible cell lines and
facilitate molecular analysis of the viral genes and
their function. For example, Yoo et al. (1995) and Dash
et al. (1997) reported that RNA transcripts from
cDNA clones of HCV-1 (genotype 1a) and HCV-N (genotype
1b), respectively, resulted in viral replication after
transfection into human hepatoma cell lines.
Unfortunately, the viability of these clones was not
tested in vivo and concerns were raised about the
infectivity of these cDNA clones in vitro (Fausto,
1997). In addition, neither clone contained the
terminal 98 conserved nucleotides at the very 3' end of
the UTR.

WO 00/75338 PCT/US00/15446

Kolykhalov et al. (1997) and Yanagi et al.
(1997, 1998) reported the derivation from HCV strains
H77 (genotype 1a) and HC-J4 (genotype 1b) of cDNA clones
of HCV that are infectious for chimpanzees. However,
while these infectious clones will aid in studying HCV 
replication and pathogenesis and will provide an 
important tool for development of in vitro replication 
and propagation systems, it is important to have 
infectious clones of more than one genotype, given the 
extensive genetic heterogeneity of HCV and the potential 
impact of such heterogeneity on the development of 
effective therapies and vaccines for HCV. 

In addition, synthetic chimeric viruses can be 
used to map the functional regions of viruses with 
different phenotypes. In flaviviruses and pestiviruses,
infectious chimeric viruses have been successfully
engineered to express different functional units of
related viruses (Bray and Lai, 1991; Pletnev et al.,
1992, 1998; Vassilev et al., 1997) and in some cases it
has been possible to make chimeras between non-related
or distantly related viruses. For instance, the IRES
element of poliovirus or bovine viral diarrhea virus has
been replaced with IRES sequences from HCV (Frolov et
al., 1998; Lu and Wimmer, 1996; Zhao et al., 1999).
Recently, the construction of an infectious chimera of
two closely related HCV subtypes has been reported. The
chimera contained the complete ORF of a genotype 1b
strain but had the 5' and 3' termini of a genotype 1a
strain (Yanagi et al., 1998).

It is important to determine whether chimeras
constructed from more divergent HCV strains are
infectious because such chimeras could be used to define
the functions of viral units and to dissect the immune
response.

Summary Of The Invention

The present invention relates to nucleic acid
sequence which comprises the genome of infectious
hepatitis C virus and in particular, nucleic acid
sequence which comprises the genome of infectious
hepatitis C virus of genotype 2a. It is therefore an
object of the invention to provide nucleic acid sequence 
which encodes infectious hepatitis C virus. Such 
nucleic acid sequence is referred to throughout the 
application as "infectious nucleic acid sequence". 

For the purposes of this application, nucleic
acid sequence refers to RNA, DNA, cDNA or any variant
thereof capable of directing host organism synthesis of
a hepatitis C virus polypeptide. It is understood that
nucleic acid sequence encompasses nucleic acid
sequences which, due to degeneracy, encode the same
polypeptide sequence as the nucleic acid sequences
described herein.
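The notion of degeneracy referred to above can be illustrated with a short Python sketch, offered as an example only. It builds the standard genetic code table and shows two different nucleotide sequences that translate to the same polypeptide. The helper name `translate` and both test sequences are hypothetical illustrations and are not sequences of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA or RNA sequence into one-letter amino acids."""
    dna = nucleotides.upper().replace("U", "T")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at four third-codon positions
# and in one whole codon, yet encode the same peptide.
seq_a = "ATGAGCACGAATCCT"
seq_b = "ATGTCAACAAACCCA"
print(translate(seq_a), translate(seq_b))
```

Both sequences would fall within the definition given above if the polypeptide they encode were one described herein, because they differ only by synonymous codon choices.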

The invention also relates to the use of the 
infectious nucleic acid sequences to produce chimeric 
genomes consisting of portions of the open reading 
frames of nucleic acid sequences of other genotypes 
(including, but not limited to, genotypes 1, 2, 3, 4, 5 
and 6) and subtypes (including, but not limited to, 
subtypes 1a, 1b, 2a, 2b, 2c, 3a, 4a-4f, 5a and 6a) of
HCV. For example, infectious nucleic acid sequence of
the 2a strain HC-J6, described herein, can be used to
produce chimeras with sequences from the genomes of
other strains of HCV from different genotypes or
subtypes. Nucleic acid sequences which comprise
sequences from two or more HCV genotypes or subtypes are
designated "chimeric nucleic acid sequences".

The invention further relates to mutations of
the infectious nucleic acid sequence of the invention 
where mutation includes, but is not limited to, point 
mutations, deletions and insertions. In one embodiment, 
a gene or fragment thereof can be deleted to determine 
the effect of the deleted gene or genes on the 
properties of the encoded virus such as its virulence 
and its ability to replicate. In an alternative
embodiment, a mutation may be introduced into the 
infectious nucleic acid sequences to examine the effect 
of the mutation on the properties of the virus. 

The invention also relates to the introduction 
of mutations or deletions into the infectious nucleic 
acid sequence in order to produce an attenuated
hepatitis C virus suitable for vaccine development.

The invention further relates to the use of 
the infectious nucleic acid sequence to produce 
attenuated viruses via passage in vitro or in vivo of 
the viruses produced by transfection of a host cell with 
the infectious nucleic acid sequence. 

The present invention also relates to the use 
of the nucleic acid sequence of the invention or 
fragments thereof in the production of polypeptides 
where "nucleic acid sequence of the invention" refers to 
infectious nucleic acid sequence, mutations of 
infectious nucleic acid sequence, chimeric nucleic acid 
sequence and sequences which comprise the genome of 
attenuated viruses produced from the infectious nucleic 
acid sequence of the invention. In one embodiment, said 
polypeptide or polypeptides are fully or partially 
purified from hepatitis C virus produced by cells 
transfected with nucleic acid sequence of the invention.
In another embodiment, the polypeptide or polypeptides
are produced recombinantly from a fragment of the 
nucleic acid sequences of the invention. In yet another 
embodiment, the polypeptides are chemically synthesized. 

The polypeptides of the invention, especially 
structural polypeptides, can serve as immunogens in the 
development of vaccines or as antigens in the 
development of diagnostic assays for detecting the 
presence of HCV in biological samples. 

The invention therefore also relates to 
vaccines for use in immunizing mammals especially humans 
against hepatitis C. In one embodiment, the vaccine 
comprises one or more polypeptides made from the nucleic 
acid sequence of the invention or fragment thereof. In
a second embodiment, the vaccine comprises a hepatitis C 
virus produced by transfection of host cells with the 
nucleic acid sequences of the invention. 

The present invention therefore relates to 
methods for preventing hepatitis C in a mammal. In one 
embodiment the method comprises administering to a 
mammal a polypeptide or polypeptides encoded by the 
nucleic acid sequence of the invention in an amount 
effective to induce protective immunity to hepatitis C. 
In another embodiment, the method of prevention 
comprises administering to a mammal a hepatitis C virus 
of the invention in an amount effective to induce 
protective immunity against hepatitis C. 

In yet another embodiment, the method of 
protection comprises administering to a mammal the 
nucleic acid sequence of the invention or a fragment 
thereof in an amount effective to induce protective 
immunity against hepatitis C.

The invention also relates to hepatitis C
viruses produced by host cells transfected with the 
nucleic acid sequence of the present invention. 

The invention therefore also provides 
pharmaceutical compositions comprising the nucleic acid 
sequence of the invention and/or the encoded hepatitis C 
viruses. The invention further provides pharmaceutical 
compositions comprising polypeptides encoded by the 
nucleic acid sequence of the invention or fragments 
thereof. The pharmaceutical compositions of the 
invention may be used prophylactically or 
therapeutically. 

The invention also relates to antibodies to 
the hepatitis C virus of the invention or their encoded
polypeptides and to pharmaceutical compositions 
comprising these antibodies. 

The invention also relates to the use of the 
nucleic acid sequences of the invention to identify cell 
lines capable of supporting the replication of HCV in
vitro.

The invention further relates to the use of 
the nucleic acid sequences of the invention or their 
encoded viral enzymes (e.g. NS3 serine protease, NS3 
helicase, NS5B RNA polymerase) to develop screening 
assays to identify antiviral agents for HCV. 

Brief Description Of Figures

Figure 1 shows the amplification and cloning
of hepatitis C virus genotype 2a (strain HC-J6ch). The
nucleotide positions correspond to the sequence of
pJ6CF, a full-length cDNA clone of hepatitis C virus,
genotype 2a, strain HC-J6ch. Products from polymerase
chain reaction are also shown. The names of the clones
obtained from these products are indicated (number of
clones sequenced are shown in parentheses). The
composition of the full-length cDNA clone is shown at
the bottom. The restriction enzymes used for cloning
are indicated. An XbaI site in HC-J6ch was eliminated by
a silent substitution at position 5494.
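For illustration, a silent substitution that eliminates a restriction site is a single-base change that destroys the recognition sequence while leaving the encoded amino acids unchanged. The Python sketch below searches for such a change in a toy coding sequence containing the XbaI recognition sequence TCTAGA. The function `remove_site_silently` and the 12-nucleotide test sequence are hypothetical; the sketch does not reproduce the actual substitution made at position 5494.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
XBAI_SITE = "TCTAGA"  # palindromic, so one strand is enough to check


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def remove_site_silently(cds: str, site: str = XBAI_SITE):
    """Find the first single-base change that removes `site` silently.

    Returns (1-based position, new base, mutated sequence), or None if the
    site is absent or no single synonymous change removes every copy of it.
    """
    protein = translate(cds)
    start = cds.find(site)
    if start < 0:
        return None
    for pos in range(start, start + len(site)):
        for base in BASES:
            if base == cds[pos]:
                continue
            mutated = cds[:pos] + base + cds[pos + 1:]
            if site not in mutated and translate(mutated) == protein:
                return pos + 1, base, mutated
    return None


# Hypothetical in-frame fragment: ATG TCT AGA GGC (Met-Ser-Arg-Gly).
toy_cds = "ATGTCTAGAGGC"
result = remove_site_silently(toy_cds)
print(result)
```

In this toy case the change falls on the third base of a serine codon, which is the typical location for a silent substitution.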

Figure 2 shows tree analysis of clones
amplified from an infectious acute phase plasma pool
generated in a chimpanzee inoculated with human plasma
containing strain HC-J6 (Okamoto et al., 1991) as well
as a tree of the predicted polyprotein sequence of
HC-J6ch and the infectious HC-J6ch cDNA clone (pJ6CF).
The nucleotide positions with deletions or insertions
were stripped in the analysis of the clones. Multiple
sequence alignments and tree analyses were performed
with GeneWorks (Oxford Molecular Group) (Bukh et al.,
1995). Genotype designations are indicated. Other
sequences included in the analysis are HC-J8 (Okamoto et
al., 1992), BEBE1 (Nakao et al., 1996), genotype 1a
infectious clone H77C (Yanagi et al., 1997) and genotype
1b infectious clone J4L6S (Yanagi et al., 1998). The scale
in each tree indicates the calculated genetic distance.

Figure 3 shows the alignment of the 
hypervariable region 1 sequences from 8 J6S clones of 
strain HC-J6ch. HC-J6ch represents the consensus amino 
acid sequence of the infectious plasma pool from an 
experimentally infected chimpanzee. HC-J6 is the 
published amino acid sequence of the original inoculum 
(Okamoto et al., 1991). 

Figure 4 shows the construction of four 
intertypic chimeric cDNA clones. White boxes are
sequences derived from genotype 2a clone pJ6CF, and
black boxes are sequences derived from genotype 1a clone
pCV-H77C (Yanagi et al., 1997). An NdeI site (mutation
at position 9158 of pCV-H77C) was eliminated and an
artificial NdeI site (mutation at position 2765 of
pCV-H77C) was created by site-directed mutagenesis;
silent mutations are underlined.

Figures 5A and 5B show the alignment of the
nucleotide sequences of the 5' (Fig. 5A) and 3' UTRs
(Fig. 5B) and the amino acid sequences of E2/p7/NS2
junctions (Fig. 5B) in the intertypic 1a/2a chimeric
cDNA clones. In the 5' UTR alignment, the first 39 nts
of core believed to be important for the IRES function
were included (Lemon and Honda, 1997). Top line: the
sequence of the infectious genotype 1a clone pCV-H77C
(Yanagi et al., 1997). Bottom line: the sequence of the
infectious genotype 2a clone pJ6CF. Dot: identity with
the sequence of H77C. Capital letter: different from the
sequence of H77C. Dash: deletion. Bold face: initiation
or stop codon of the ORF. Underlined: AgeI cleavage
site. Arrow: putative sites in the HCV polyprotein
cleaved by host signal peptidases. Numbering
corresponds to the sequence of pCV-H77C.
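The alignment convention described for Figures 5A and 5B (dot for identity with the top line, capital letter for a difference, dash for a deletion) can be sketched in Python as follows. This is offered as an illustration of the notation only; the function name `dot_notation` and the two short aligned fragments are hypothetical and are not taken from pCV-H77C or pJ6CF.

```python
def dot_notation(top_line: str, bottom_line: str) -> str:
    """Render `bottom_line` relative to `top_line`.

    '.' marks identity with the top line, a capital letter marks a
    position that differs, and '-' marks a deletion in the bottom line.
    Both inputs must already be aligned to the same length.
    """
    if len(top_line) != len(bottom_line):
        raise ValueError("sequences must be aligned to the same length")
    rendered = []
    for top, bottom in zip(top_line.upper(), bottom_line.upper()):
        if bottom == "-":
            rendered.append("-")      # deletion relative to the top line
        elif bottom == top:
            rendered.append(".")      # identical residue
        else:
            rendered.append(bottom)   # differing residue, shown in capitals
    return "".join(rendered)


# Hypothetical aligned fragments (toy data).
top = "GCCAGCCCCCTG"
bottom = "ACCCGCCCC-TA"
print(top)
print(dot_notation(top, bottom))
```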

Figures 6A-6F show the nucleotide sequence of
the infectious hepatitis C virus clone of genotype 1a
strain H77C and Figures 6G-6H show the amino acid
sequence encoded by the clone.

Figures 7A-7F show the nucleotide sequence of
the infectious hepatitis C virus clone of genotype 1b
strain HC-J4 and Figures 7G-7H show the amino acid
sequence encoded by the clone.

DESCRIPTION OF THE INVENTION

The present invention relates to nucleic acid
sequence which comprises the genome of an infectious 
hepatitis C virus. More specifically, the invention 
relates to nucleic acid sequence which encodes 
infectious hepatitis C virus of strain HC-J6ch, genotype 
2a. The infectious nucleic acid sequence of the 
invention is shown in SEQ ID NO:1 and is contained in a
plasmid construct deposited with the American Type 
Culture Collection (ATCC) on May 28, 1999 and having 
ATCC accession number PTA-153. 

The invention also relates to "chimeric 
nucleic acid sequences" where the chimeric nucleic acid 
sequences consist of open-reading frame sequences and/or 
5' and/or 3' untranslated sequences taken from nucleic 
acid sequences of hepatitis C viruses of different 
genotypes or subtypes. 

In one embodiment, the chimeric nucleic acid 
sequence consists of sequence from the genome of 
infectious HCV of genotype 2a which encodes structural 
polypeptides and sequence from the genome of an HCV of a
different genotype or subtype which encodes 
nonstructural polypeptides. 

Alternatively, the nonstructural region of
infectious HCV of genotype 2a and structural region of an
HCV of a different genotype or subtype may be combined.
This will result in a chimeric nucleic acid sequence
consisting of sequence from the genome of infectious HCV
of genotype 2a which encodes nonstructural polypeptides
and sequence from the genome of an HCV of another
genotype or subtype which encodes structural
polypeptides.

Preferably, the nucleic acid sequence from the
genome of the infectious HCV clone of genotype 1a
(deposited with the ATCC on June 2, 1999; Figures
6A-6F), or the nucleic acid sequence from the genome of the
infectious HCV clone of genotype 1b (ATCC accession
number 209596; Figures 7A-7F) is used to construct the
chimeric nucleic acid sequence with the HCV of genotype
2a of the invention.
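As a purely schematic illustration of the chimeric sequences described above, the Python sketch below models a genome as four consecutive segments and exchanges the structural region between two parental genomes. The `Genome` class, the function `make_chimera`, the segment boundaries and the short placeholder sequences are all assumptions made for this example; they do not represent the actual junctions or sequences of any clone described herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    """Simplified genome: 5' UTR, structural ORF part, nonstructural ORF part, 3' UTR."""
    name: str
    utr5: str
    structural: str
    nonstructural: str
    utr3: str

    def sequence(self) -> str:
        return self.utr5 + self.structural + self.nonstructural + self.utr3


def make_chimera(structural_donor: Genome, backbone: Genome) -> Genome:
    """Combine the structural region of one genome with the rest of another."""
    if len(structural_donor.structural) % 3 != 0:
        # A structural segment that is not a whole number of codons
        # would shift the reading frame of the downstream region.
        raise ValueError("structural region must be a whole number of codons")
    return Genome(
        name=f"{backbone.name}/{structural_donor.name}-structural",
        utr5=backbone.utr5,
        structural=structural_donor.structural,
        nonstructural=backbone.nonstructural,
        utr3=backbone.utr3,
    )


# Placeholder segments only (toy data, not real HCV sequence).
genome_2a = Genome("2a", "GG", "ATGAAA", "CCCTGA", "TT")
genome_1a = Genome("1a", "CC", "ATGGGG", "AAATGA", "AA")

chimera = make_chimera(structural_donor=genome_2a, backbone=genome_1a)
print(chimera.name, chimera.sequence())
```

Swapping the two arguments gives the reciprocal construct, in which the genotype 2a genome supplies the nonstructural region and the untranslated regions.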

It is believed that the construction of such 
chimeric nucleic acid sequences will be of importance in 
studying the growth and virulence properties of 
hepatitis C virus and in the production of candidate 
hepatitis C virus vaccines suitable to confer protection 
against multiple genotypes of HCV. For example, one 
might produce a "multivalent" vaccine by putting 
epitopes from several genotypes or subtypes into one 
clone. Alternatively one might replace just a single 
gene from an infectious sequence with the corresponding 
gene from the genomic sequence of a strain from another 
genotype or subtype or create a chimeric gene which 
contains portions of a gene from two genotypes or 
subtypes. Examples of genes which could be replaced or 
which could be made chimeric, include, but are not
limited to, the E1, E2 and NS4 genes.

The invention further relates to mutations of 
the infectious nucleic acid sequences where "mutations" 
include, but are not limited to, point mutations, 
deletions and insertions. Of course, one of ordinary 
skill in the art would recognize that the size of the 
insertions would be limited by the ability of the 
resultant nucleic acid sequence to be properly packaged 
within the virion. Such mutations could be produced by
techniques known to those of skill in the art such as
site-directed mutagenesis, fusion PCR, and restriction
digestion followed by religation.

In one embodiment, mutagenesis might be 
undertaken to determine sequences that are important for 
viral properties such as replication or virulence. For 
example, one may introduce a mutation into the 
infectious nucleic acid sequence which eliminates the 
cleavage site between the NS4A and NS4B polypeptides to 
examine the effects on viral replication and processing 
of the polypeptide. 

Alternatively, one may delete all or part of a 
gene or of the 5' or 3' nontranslated region contained in 
an infectious nucleic acid sequence and then transfect a 
host cell (animal or cell culture) with the mutated 
sequence and measure viral replication in the host by 
methods known in the art such as RT-PCR. Preferred 
genes include, but are not limited to, the p7, NS4B and
NS5A genes. Of course, those of ordinary skill in the 
art will understand that deletion of part of a gene, 
preferably the central portion of the gene, may be 
preferable to deletion of the entire gene in order to 
conserve the cleavage site boundaries which exist 
between proteins in the HCV polyprotein and which are 
necessary for proper processing of the polyprotein. 
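The reasoning above, namely that deleting the central portion of a gene preserves the cleavage site boundaries at each end, can be sketched in Python as follows. The function `delete_central_portion`, its `keep_codons` parameter and the toy open reading frame are hypothetical and serve only to illustrate an in-frame internal deletion.

```python
def delete_central_portion(orf: str, gene_start: int, gene_end: int,
                           keep_codons: int = 2) -> str:
    """Delete the middle of a gene while keeping its flanking codons.

    `gene_start` and `gene_end` are 0-based, end-exclusive offsets into
    `orf` and must lie on codon boundaries. `keep_codons` codons are kept
    at each end of the gene so that the junctions with the neighbouring
    proteins are left intact and the reading frame is preserved.
    """
    if gene_start % 3 != 0 or gene_end % 3 != 0:
        raise ValueError("gene boundaries must lie on codon boundaries")
    flank = keep_codons * 3
    cut_start = gene_start + flank
    cut_end = gene_end - flank
    if cut_end <= cut_start:
        raise ValueError("gene is too short for the requested flanks")
    return orf[:cut_start] + orf[cut_end:]


# Toy ORF: upstream gene, a six-codon target gene, downstream gene.
upstream = "ATGAAA"
target_gene = "CCCGGGAAATTTCCCGGG"
downstream = "TTTTAA"
toy_orf = upstream + target_gene + downstream

deleted = delete_central_portion(
    toy_orf, gene_start=len(upstream), gene_end=len(upstream) + len(target_gene)
)
print(deleted)
```

Because the removed span is a whole number of codons, the downstream portion of the polyprotein remains in frame.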

In the alternative, if the transfection is
into a host animal such as a chimpanzee, one can monitor
the virulence phenotype of the virus produced by
transfection of the mutated infectious nucleic acid
sequence by methods known in the art such as measurement
of liver enzyme levels (alanine aminotransferase (ALT)
or isocitrate dehydrogenase (ICD)) or by histopathology
of liver biopsies. Thus, mutations of the infectious
nucleic acid sequences may be useful in the production 
of attenuated HCV strains suitable for vaccine use. 

The invention also relates to the use of the 
infectious nucleic acid sequence of the present 
invention to produce attenuated viral strains via 
passage in vitro or in vivo of the virus produced by 
transfection with the infectious nucleic acid sequence. 

The present invention therefore relates to the 
use of the nucleic acid sequence of the invention to 
identify cell lines capable of supporting the 
replication of HCV. 

In particular, it is contemplated that the 
mutations of the infectious nucleic acid sequence of the 
invention and the production of chimeric sequences as 
discussed above may be useful in identifying sequences 
critical for cell culture adaptation of HCV and hence, 
may be useful in identifying cell lines capable of 
supporting HCV replication. 

Transfection of tissue culture cells with the 
nucleic acid sequences of the invention may be done by 
methods of transfection known in the art such as 
electroporation, precipitation with DEAE-Dextran or 
calcium phosphate or liposomes. 

In one such embodiment, the method comprises 
the growing of animal cells, especially human cells, in 
vitro and transfecting the cells with the nucleic acid 
of the invention, then determining if the cells show 
indicia of HCV infection. Such indicia include the 
detection of viral antigens in the cell, for example, by 
immunofluorescence procedures well known in the art; the 
detection of viral polypeptides by Western blotting
using antibodies specific therefor; and the detection of
newly transcribed viral RNA within the cells via methods
such as RT-PCR. The presence of live, infectious virus
particles following such tests may also be shown by
injection of cell culture medium or cell lysates into
healthy, susceptible animals, with subsequent exhibition
of the signs and symptoms of HCV infection.

Suitable cells or cell lines for culturing HCV
include, but are not limited to, lymphocyte and
hepatocyte cell lines known in the art.

Alternatively, primary hepatocytes can be
cultured, and then infected with HCV; or, the hepatocyte
cultures could be derived from the livers of infected
chimpanzees. In addition, various immortalization
methods known to those of ordinary skill in the art can
be used to obtain cell lines derived from hepatocyte
cultures. For example, primary hepatocyte cultures may
be fused to a variety of cells to maintain stability.

The present invention further relates to the
in vitro and in vivo production of hepatitis C viruses
from the nucleic acid sequences of the invention.

In one embodiment, the sequences of the
invention can be inserted into an expression vector that
functions in eukaryotic cells. Eukaryotic expression
vectors suitable for producing high efficiency gene
transfer in vivo are well known to those of ordinary
skill in the art and include, but are not limited to,
plasmids, vaccinia viruses, retroviruses, adenoviruses
and adeno-associated viruses.

In another embodiment, the sequences contained
in the recombinant expression vector can be transcribed
in vitro by methods known to those of ordinary skill in
the art in order to produce RNA transcripts which encode
the hepatitis C viruses of the invention. The hepatitis 
C viruses of the invention may then be produced by 
transfecting cells by methods known to those of ordinary 
skill in the art with either the in vitro transcription 
mixture containing the RNA transcripts or with the 
recombinant expression vectors containing the nucleic 
acid sequences described herein. 

The hepatitis C viruses produced from the 
sequences of the invention may be purified or partially 
purified from the transfected cells by methods known to 
those of ordinary skill in the art. In a preferred 
embodiment, the viruses are partially purified prior to 
their use as immunogens in the pharmaceutical
compositions and vaccines of the present invention.

The present invention therefore relates to the
use of the hepatitis C viruses produced from the nucleic
acid sequences of the invention as immunogens in live or
killed (e.g., formalin inactivated) vaccines to prevent
hepatitis C in a mammal.

In an alternative embodiment, the immunogen of 
the present invention may be an infectious nucleic acid 
sequence, a chimeric nucleic acid sequence, or a mutated 
infectious nucleic acid sequence which encodes a 
hepatitis C virus. Where the sequence is a cDNA 
sequence, the cDNAs and their RNA transcripts may be 
used to transfect a mammal by direct injection into the 
liver tissue of the mammal as described in the Examples. 

Alternatively, direct gene transfer may be 
accomplished via administration of a eukaryotic 
expression vector containing a nucleic acid sequence of 
the invention.

In yet another embodiment, the immunogen may
be a polypeptide encoded by the nucleic acid sequences 
of the invention. The present invention therefore also 
relates to polypeptides produced from the nucleic acid 
sequences of the invention or fragments thereof. In one 
embodiment, polypeptides of the present invention can be 
recombinantly produced by synthesis from the nucleic 
acid sequences of the invention or isolated fragments 
thereof, and purified, or partially purified, from 
transfected cells using methods already known in the 
art. In an alternative embodiment, the polypeptides may 
be purified or partially purified from viral particles 
produced via transfection of a host cell with the 
nucleic acid sequences of the invention. Such
polypeptides might, for example, include either capsid
or envelope polypeptides prepared from the sequences of 
the present invention. 

When used as immunogens, the nucleic acid 
sequences of the invention, or the polypeptides or 
viruses produced therefrom, are preferably partially 
purified prior to use as immunogens in pharmaceutical 
compositions and vaccines of the present invention. 
When used as a vaccine, the sequences and the 
polypeptide and virus products thereof, can be 
administered alone or in a suitable diluent, including, 
but not limited to, water, saline, or some type of 
buffered medium. The vaccine according to the present 
invention may be administered to an animal, especially a 
mammal, and most especially a human, by a variety of 
routes, including, but not limited to, intradermally, 
intramuscularly, subcutaneously, or in any combination 
thereof.

Suitable amounts of material to administer for
prophylactic and therapeutic purposes will vary 
depending on the route selected and the immunogen 
(nucleic acid, virus, polypeptide) administered. One 
skilled in the art will appreciate that the amounts to 
be administered for any particular treatment protocol 
can be readily determined without undue experimentation. 
The vaccines of the present invention may be 
administered once or periodically until a suitable titer 
of anti-HCV antibodies appears in the blood. For an
immunogen consisting of a nucleic acid sequence, a
suitable amount of nucleic acid sequence to be used for
prophylactic purposes might be expected to fall in the
range of from about 100 ng to about 5 mg and most
preferably in the range of from about 500 ng to about
2 mg. For a polypeptide, a suitable amount to use for
prophylactic purposes is preferably 100 ng to 100 ng and
for a virus 10^ to 10« infectious doses. Such
administration will, of course, occur prior to any sign
of HCV infection.

A vaccine of the present invention may be 
employed in such forms as capsules, liquid solutions, 
suspensions or elixirs for oral administration, or 
sterile liquid forms such as solutions or suspensions. 
An inert carrier is preferably used, such as saline or 
phosphate-buffered saline, or any such carrier in which 
the HCV of the present invention can be suitably 
suspended. The vaccines may be in the form of single 
dose preparations or in multi-dose flasks which can be 
utilized for mass-vaccination programs of both animals 
and humans. For purposes of using the vaccines of the 
present invention reference is made to Remington's
Pharmaceutical Sciences, Mack Publishing Co., Easton,
Pa., Osol (Ed.) (1980); and New Trends and Developments
in Vaccines, Voller et al. (Eds.), University Park
Press, Baltimore, Md. (1978), both of which provide much
useful information for preparing and using vaccines. Of
course, the polypeptides of the present invention, when 
used as vaccines, can include, as part of the 
composition or emulsion, a suitable adjuvant, such as 
alum (or aluminum hydroxide) when humans are to be 
vaccinated, to further stimulate production of 
antibodies by immune cells. When nucleic acids, viruses 
or polypeptides are used for vaccination purposes, other 
specific adjuvants such as CpG motifs (Krieg, A.M. et
al. (1995) and (1996)) may prove useful.

When the nucleic acids, viruses and 
polypeptides of the present invention are used as 
vaccines or inocula, they will normally exist as 
physically discrete units suitable as a unitary dosage 
for animals, especially mammals, and most especially 
humans, wherein each unit will contain a predetermined 
quantity of active material calculated to produce the 
desired immunogenic effect in association with the 
required diluent. The dose of said vaccine or inoculum
according to the present invention is administered at
least once. In order to increase the antibody level, a
second or booster dose may be administered at some time
after the initial dose. The need for, and timing of,
such booster dose will, of course, be determined within
the sound judgment of the administrator of such vaccine
or inoculum and according to sound principles well known
in the art. For example, such booster dose could
reasonably be expected to be advantageous at some time
between about 2 weeks and about 6 months following the
initial vaccination. Subsequent doses may be
administered as indicated. 

The nucleic acid sequences, viruses and 
polypeptides of the present invention can also be 
administered for purposes of therapy, where a mammal, 
especially a primate, and most especially a human, is 
already infected, as shown by well known diagnostic 
measures. When the nucleic acid sequences, viruses or 
polypeptides of the present invention are used for such 
therapeutic purposes, many of the same criteria will
apply as when they are used as a vaccine, except that
inoculation will occur post-infection. Thus, when the
nucleic acid sequences, viruses or polypeptides of the
present invention are used as therapeutic agents in the
treatment of infection, the therapeutic agent comprises
a pharmaceutical composition containing a sufficient
amount of said nucleic acid sequences, viruses or
polypeptides so as to elicit a therapeutically effective
response in the organism to be treated. Of course, the
amount of pharmaceutical composition to be administered 
will, as for vaccines, vary depending on the immunogen 
contained therein (nucleic acid, polypeptide, virus) and 
on the route of administration. 

The therapeutic agent according to the present 
invention can thus be administered by subcutaneous,
intramuscular or intradermal routes. One skilled in the 
art will certainly appreciate that the amounts to be 
administered for any particular treatment protocol can 
be readily determined without undue experimentation. Of 
course, the actual amounts will vary depending on the 
route of administration as well as the sex, age, and
clinical status of the subject which, in the case of
human patients, is to be determined with the sound 
judgment of the clinician. 

The therapeutic agent of the present invention 
can be employed in such forms as capsules, liquid 
solutions, suspensions or elixirs, or sterile liquid 
forms such as solutions or suspensions. An inert carrier 
is preferably used, such as saline, phosphate-buffered 
saline, or any such carrier in which the HCV of the 
present invention can be suitably suspended. The 
therapeutic agents may be in the form of single dose 
preparations or in multi-dose flasks which can be
utilized for mass-treatment programs of both animals and
humans. Of course, when the nucleic acid sequences,
viruses or polypeptides of the present invention are 
used as therapeutic agents they may be administered as a 
single dose or as a series of doses, depending on the 
situation as determined by the person conducting the 
treatment. 

The nucleic acids, polypeptides and viruses of 
the present invention can also be utilized in the 
production of antibodies against HCV. The term 
"antibody" is herein used to refer to immunoglobulin 
molecules and immunologically active portions of 
immunoglobulin molecules. Examples of antibody 
molecules are intact immunoglobulin molecules, 
substantially intact immunoglobulin molecules and 
portions of an immunoglobulin molecule, including those 
portions known in the art as Fab, F(ab')2 and F(v) as 
well as chimeric antibody molecules. 

Thus, the polypeptides, viruses and nucleic 
acid sequences of the present invention can be used in 
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the generation of antibodies that imraunoreact (i.e., 
specific binding between an antigenic determinant- 
containing molecule and a molecule containing an 
antibody combining site such as a whole antibody 
molecule or an active portion thereof) with antigenic 
determinants on the surface of hepatitis C virus 
particles. 

The present invention therefore also relates 
to antibodies produced following immunization with the 
nucleic acid sequences, viruses or polypeptides of the 
present invention. These antibodies are typically 
produced by immunizing a mammal with an immunogen or 
vaccine, to induce antibody molecules having 
immunospecificity for polypeptides or viruses produced 
in response to infection with the nucleic acid sequences 
of the present invention. When used in generating such 
antibodies, the nucleic acid sequences, viruses, or 
polypeptides of the present invention may be linked to 
some type of carrier molecule. The resulting antibody 
molecules are then collected from said mammal. 
Antibodies produced according to the present invention 
have the unique advantage of being generated in response 
to authentic, functional polypeptides produced according
to the actual cloned HCV genome. 

The antibody molecules of the present 
invention may be polyclonal or monoclonal. Monoclonal 
antibodies are readily produced by methods well known in 
the art. Portions of immunoglobulin molecules, such as
Fabs, as well as chimeric antibodies, may also be 
produced by methods well known to those of ordinary 
skill in the art of generating such antibodies.

The antibodies according to the present
invention may also be contained in blood, plasma, serum,
hybridoma supernatants, and the like. Alternatively,
the antibody of the present invention is isolated to the
extent desired by well known techniques such as, for
example, using DEAE Sephadex. The antibodies produced
according to the present invention may be further
purified so as to obtain specific classes or subclasses
of antibody such as IgM, IgG, IgA, and the like.
Antibodies of the IgG class are preferred for purposes
of passive protection.

The antibodies of the present invention are 
useful in the prevention and treatment of diseases 
caused by hepatitis C virus in animals, especially
mammals, and most especially humans.

In providing the antibodies of the present 
invention to a recipient mammal, preferably a human, the 
dosage of administered antibodies will vary depending on 
such factors as the mammal's age, weight, height, sex, 
general medical condition, previous medical history, and 
the like. 

In general, it will be advantageous to provide 
the recipient mammal with a dosage of antibodies in the
range of from about 1 mg/kg body weight to about 10
mg/kg body weight of the mammal, although a lower or
higher dose may be administered if found desirable.
Such antibodies will normally be administered by
intravenous or intramuscular route as an inoculum. The
antibodies of the present invention are intended to be
provided to the recipient subject in an amount
sufficient to prevent, lessen or attenuate the severity,
extent or duration of any existing infection.

The antibodies prepared by use of the nucleic
acid sequences, viruses or polypeptides of the present
invention are also highly useful for diagnostic
purposes. For example, the antibodies can be used as in
vitro diagnostic agents to test for the presence of HCV
in biological samples taken from animals, especially
humans. Such assays include, but are not limited to,
radioimmunoassays, EIA, fluorescence, Western blot
analysis and ELISAs. In one such embodiment, the
biological sample is contacted with antibodies of the 
present invention and a labeled second antibody is used 
to detect the presence of HCV to which the antibodies 
are bound. 

Such assays may be, for example, direct where
the labeled first antibody is immunoreactive with the
antigen, such as, for example, a polypeptide on the
surface of the virus; indirect where a labeled second
antibody is reactive with the first antibody; a
competitive protocol such as would involve the addition
of a labeled antigen; or sandwich where both labeled and
unlabeled antibody are used, as well as other protocols
well known and described in the art.

In one embodiment, an immunoassay method would
utilize an antibody specific for HCV envelope
determinants and would further comprise the steps of
contacting a biological sample with the HCV-specific
antibody and then detecting the presence of HCV material
in the test sample using one of the types of assay
protocols as described above. Polypeptides and
antibodies produced according to the present invention
may also be supplied in the form of a kit, either
present in vials as purified material, or present in
compositions and suspended in suitable diluents as
previously described.

In a preferred embodiment, such a diagnostic
test kit for detection of HCV antigens in a test sample
comprises in combination a series of containers, each
containing a reagent needed for such assay. Thus, one
such container would contain a specific amount of
HCV-specific antibody as already described, a second
container would contain a diluent for suspension of the
sample to be tested, a third container would contain a
positive control and an additional container would
contain a negative control. An additional container
could contain a blank.

For all prophylactic, therapeutic and
diagnostic uses, the antibodies of the invention and
other reagents, plus appropriate devices and
accessories, may be provided in the form of a kit so as
to facilitate ready availability and ease of use.

The present invention also relates to the use
of nucleic acid sequences and polypeptides of the
present invention to screen potential antiviral agents
for antiviral activity against HCV. Such screening
methods are known by those of skill in the art.
Generally, the antiviral agents are tested at a variety
of concentrations for their effect on preventing viral
replication in cell culture systems which support viral
replication, and then for an inhibition of infectivity
or of viral pathogenicity (and a low level of toxicity)
in an animal model system.

In one embodiment, animal cells (especially 
human cells) transfected with the nucleic acid sequences
of the invention are cultured in vitro and the cells are
treated with a candidate antiviral agent (a chemical,
peptide etc.) by adding the candidate agent to the 
medium. The treated cells are then exposed, possibly 
under transfecting or fusing conditions known in the 
art, to the nucleic acid sequences of the present 
invention. A sufficient period of time would then be 
allowed to pass for infection to occur, following which 
the presence or absence of viral replication would be 
determined versus untreated control cells by methods 
known to those of ordinary skill in the art. Such 
methods include, but are not limited to, the detection 
of viral antigens in the cell, for example, by 
immunofluorescence procedures well known in the art; the 
detection of viral polypeptides by Western blotting
using antibodies specific therefor; the detection of 
newly transcribed viral RNA within the cells by RT-PCR; 
and the detection of the presence of live, infectious 
virus particles by injection of cell culture medium or 
cell lysates into healthy, susceptible animals, with 
subsequent exhibition of the signs and symptoms of HCV 
infection. A comparison of results obtained for control 
cells (treated only with nucleic acid sequence) with 
those obtained for treated cells (nucleic acid sequence 
and antiviral agent) would indicate the degree, if any,
of antiviral activity of the candidate antiviral agent. 
Of course, one of ordinary skill in the art would 
readily understand that such cells can be treated with 
the candidate antiviral agent either before or after 
exposure to the nucleic acid sequence of the present 
invention so as to determine what stage, or stages, of 
viral infection and replication said agent is effective 
against.

In an alternative embodiment, a viral enzyme
such as NS3 protease, NS2-NS3 protease, NS3 helicase or
NS5B RNA polymerase may be produced from a nucleic acid
sequence of the invention and used to screen for
inhibitors which may act as antiviral agents. The
structural and nonstructural regions of the HCV genome,
including nucleotide and amino acid locations, have been
determined, for example, as depicted in Houghton, M.
(1996), Fig. 1; and Major, M.E. et al. (1997), Table 2.

Such above-mentioned protease inhibitors may
take the form of chemical compounds or peptides which
mimic the known cleavage sites of the protease and may
be screened using methods known to those of skill in the
art (Houghton, M. (1996) and Major, M.E. et al. (1997)).
For example, a substrate may be employed which mimics
the protease's natural substrate, but which provides a
detectable signal (e.g. by fluorimetric or colorimetric
methods) when cleaved. This substrate is then incubated
with the protease and the candidate protease inhibitor
under conditions of suitable pH, temperature etc. to
detect protease activity. The proteolytic activities of
the protease in the presence or absence of the candidate
inhibitor are then determined.
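
The comparison of proteolytic activity in the presence and absence of a
candidate inhibitor can be summarized as a percent inhibition. The
following is a minimal sketch only, assuming a fluorimetric or
colorimetric read-out in arbitrary units and a no-enzyme blank; the
function name, the blank correction and the example readings are
hypothetical and are not taken from the disclosure.

```python
def percent_inhibition(signal_with_inhibitor, signal_without_inhibitor, blank=0.0):
    """Return percent inhibition of protease activity.

    All arguments are raw read-outs (arbitrary units) from the cleavable
    substrate; `blank` is a hypothetical no-enzyme background reading.
    """
    uninhibited = signal_without_inhibitor - blank
    if uninhibited <= 0:
        raise ValueError("uninhibited signal must exceed the blank")
    inhibited = signal_with_inhibitor - blank
    return 100.0 * (1.0 - inhibited / uninhibited)


# Hypothetical readings: 1000 units without inhibitor, 300 with, blank 100.
example = percent_inhibition(300.0, 1000.0, blank=100.0)
```

With these illustrative readings the candidate would be scored as
inhibiting roughly three quarters of the protease activity.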

In yet another embodiment, a candidate 
antiviral agent (such as a protease inhibitor) may be 
directly assayed in vivo for antiviral activity by 
administering the candidate antiviral agent to a 
chimpanzee transfected with a nucleic acid sequence of 
the invention or infected with a virus of the invention 
and then measuring viral replication in vivo via methods 
such as RT-PCR. Of course, the chimpanzee may be
treated with the candidate agent either before or after
transfection with the infectious nucleic acid sequence
or infection with a virus of the invention so as to
determine what stage, or stages, of viral infection and
replication the agent is effective against.

The invention also provides that the nucleic 
acid sequences, viruses and polypeptides of the 
invention may be supplied in the form of a kit, alone or 
in the form of a pharmaceutical composition. 

All scientific publications and/or patents
cited herein are specifically incorporated by reference.
The following examples illustrate various aspects of the
invention but are in no way intended to limit the scope
thereof.



EXAMPLES 

Materials and Methods 

Source of HCV 

An infectious plasma pool of HCV genotype 2a 
(HC-J6ch) prepared from acute phase plasma of a 
chimpanzee experimentally inoculated with plasma from a 
Japanese patient infected with strain HC-J6 (Okamoto et 
al., 1991) was used for cloning. An infectious cDNA 
clone of HCV strain H77, genotype 1a, was also used
(pCV-H77C; Yanagi et al., 1997).

Amplification, cloning and sequence analysis 

Viral RNA was extracted from 100 µl aliquots
of the HC-J6ch plasma pool with the TRIzol system
(GIBCO/BRL) (Yanagi et al., 1997). Primers used in cDNA
synthesis and PCR amplification were based on the
genomic sequence of strain HC-J6 (Okamoto et al., 1991)
and on the conserved region (3'X) of the 3' UTR of HCV
genotype 2a (Tanaka et al., 1996) (Table 1). The RNA



WO 00/75338    PCT/US00/15446

was denatured at 65°C for 2 min, and cDNA was
synthesized at 42°C for 1 hour with Superscript II
reverse transcriptase (GIBCO/BRL) and specific reverse
primers in 20 µl reaction volumes. The cDNA mixtures
were treated with RNase H and RNase T1 (GIBCO/BRL) at
37°C for 20 min.

TABLE 1

Oligonucleotides used for amplification and cloning
of strain HC-J6ch, genotype 2a

Designation       Sequence (5' -> 3') a

Oil OT C—UTT      ACTGGACACGGAGGTGGCCGCGTC
2426S-H77         TTGTTCTTGTCGGGTTAATGGCGC
2645R-H77         GGGTGTACTACAGRCATGAGTAAG
2832R-H77         AAGCGCCCCTAACTGATGATG
H2751SII          CGTCATCGA.TACCTCAGCGGGCATATGCACTGGACACGGA
H2785R            GTCCAGTGCATATGCCCGCTGAGG
H2870R            CATGCACCAGCTGATATAGCGCTTGTAATATG
H7851S            TCCGTAGAGGAAGCTTGCAGCCTGACGCCC
H9140S(M)         CAGAGGAGGCAGGGTGCTATATGTGGCAAGTAC
H9173R(M)         GTACTTGCCACATATAGCAGCCCTGCCTCCTCTG
H9471R            CGTCTCTAGACAGGAAATGGCTTAAGAGGCCGGAGTGTTTACC
J6-H2556S         TTATGGATGCrCATCTTGTTGGGCCAGGCCGAAGCAGCTTTGGAGflACrTWWaATAfrr
                  CAATGC
356RF-J6H         AGGATTTGTGCTCATGGTGCACGGTCTACGAG
1S-J6F b          TTTTTTTTGOGGCCGCraAaSaCGSMICRCTArASRCCCGCCCCTAATAGr;
333S-J6           C03TGCACCATGAGCACAAATCCTAAACCTC
753R-J6           GGATGTACCCCATGAGGTCGGCAAAG
2543S-J6F         GTTTGCGCCTGCTTATGGATGCTCATCTTG
2787R-J6(26)      GCGTCATAAGCATATGCCTGTTGGGG
3329R-J6          CCCTCAGCACTGGAGTACATCTG
5487-J6F          C6TCA.T6CA.TACCCCTAGGGCGGCTCTCATTGAAGAGGG
5518R-J6F         CGTCCCCTCTTCAATGAGAGCCGCTCTAGA
9251S-J6F         GCGGTGAAGACCAAGCTCAAACTCACTC
9305R-J6F         AATCTAGAAGGCGCGCTTCCGGCAATGGAGTGARTTTRARr:
9310R-J6F         C6TCTCTASAGGATAAATCCAGGAGGCGCGCTTCCGGC
9399S-J6F         TACTTTTTGTAGGGGTAGGCCTTTTCC
9464-J6F          CGTCTCTAGAGTGTAGCTAATGTGTGCCGCTCTA
9470(24)-J6       CTATGGAGTGTAGCTAATGTGTGC
J6-3'XR           CGTCTCTAGACATGATCTGCAGAGAGACCAGTTACGGCACTCTCTGFCAGTCATGCGGC
                  TCACGGACCTTTCACAGCTAGCCGTGACTAGGGCTAAGATGGAGCCACC

a HCV-specific sequences are shown in plain text, non HCV-specific
sequences are shown in bold face, and cleavage sites used for cDNA
cloning are underlined.
b The core sequence of the T7 promoter is shown in italics.



The strategy used to amplify and clone the 
full-length HC-J6ch sequence is shown in Fig. 1. 
Nucleotide positions correspond to those of the 2a
infectious clone (pJ6CF) that is described herein. The
5' end of HC-J6ch (nts. 17-297, excluding primer
sequences) was amplified from 2 µl of cDNA synthesized
with primer a-2 (Yanagi et al., 1996). PCR was performed
with AmpliTaq Gold DNA polymerase (Perkin-Elmer) as
described previously (Yanagi et al., 1996) using primers
1S-J6F and a-2. After purification, the amplified
products were cloned into pGEM-T Easy vector (Promega) 
using standard procedures and 5 clones (pJ6-5'UTR) were 
sequenced. 

The 3' end of HC-J6ch was amplified in 3 
overlapping pieces. RT-PCR of a short fragment of NS5B 
(nts. 9279-9439) was performed with primers 9251S-J6F 
and 9464R-J6F as described above. The PCR products were
cloned into pGEM-T Easy vector and sequence analysis was
performed from 5 pJ6-3'F clones. A second region
spanning from NS5B to the conserved region of the 3' UTR
(nts. 9376-9629) was amplified in RT-nested PCR
(external primers H9261F and H3'X58R, internal primers
H9282F and H3'X45R) (Yanagi et al., 1997). The amplified
products were cloned into pGEM-9zf(-) by using HindIII
and XbaI sites and 14 pJ6-3'VR clones were sequenced.
The third fragment, which included the 3' terminal
sequence, was amplified with primers 9399S-J6F and
J6-3'XR from one of the pJ6-3'VR clones, and cloned into
one of the pJ6-3'F clones by using StuI and XbaI sites
(pJ6-3'X). 

The ORF of HCV HC-J6ch was amplified by long
RT-PCR in 3 overlapping pieces. The amplification was
performed on 2 µl of the cDNA mixtures with the
Advantage cDNA polymerase mix (Clontech) (Yanagi et al.,
1997). The J6S fragment (nts. 86-2761) was amplified
with primers a-1 (Yanagi et al., 1996) and J6-2787R from
cDNA synthesized with primer J6-3329R. A single PCR
round was performed in a Robocycler thermal cycler
(Stratagene), and consisted of denaturation at 99°C for
35 sec, annealing at 67°C for 30 sec and elongation at
68°C for 4 min 30 sec during the first 5 cycles, 5 min
during the next 10 cycles, 5 min 30 sec during the
following 10 cycles and 6 min during the last 10 cycles.
The J6B fragment (nts. 2573-5488) was amplified with
primers 2543S-J6F and 5518R-J6F from cDNA synthesized
with primer 5518R-J6F. Finally, the J6A fragment (nts.
5515-9282) was amplified with primers 5487S-J6F and
9310R-J6F from cDNA synthesized with primer
9470R(24)-J6F. PCR amplifications of fragments J6B and
J6A consisted of denaturation at 99°C for 35 sec,
annealing at 67°C for 30 sec and elongation at 68°C for 6
min during the first 5 cycles, 7 min during the next 10
cycles, 8 min during the following 10 cycles, and 9 min
during the last 10 cycles.
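
The two cycling programs above use elongation times that lengthen in
blocks of cycles. As a rough illustration only, the sketch below encodes
the stated step times and sums them; it ignores ramp times between
temperatures, and the variable and function names are hypothetical.

```python
# Step times in seconds, as stated in the text; ramp times are ignored.
DENATURATION_S = 35
ANNEALING_S = 30

# (number of cycles, elongation time in seconds) for each block of cycles.
J6S_PROGRAM = [(5, 270), (10, 300), (10, 330), (10, 360)]
J6B_J6A_PROGRAM = [(5, 360), (10, 420), (10, 480), (10, 540)]


def total_cycles(program):
    """Total number of PCR cycles in a program."""
    return sum(cycles for cycles, _ in program)


def programmed_seconds(program):
    """Sum of denaturation, annealing and elongation over all cycles."""
    return sum(
        cycles * (DENATURATION_S + ANNEALING_S + elongation)
        for cycles, elongation in program
    )


j6s_hours = programmed_seconds(J6S_PROGRAM) / 3600
j6b_j6a_hours = programmed_seconds(J6B_J6A_PROGRAM) / 3600
```

Under these assumptions each program runs for 35 cycles, with about 3.8
hours of programmed step time for J6S and about 5.1 hours for J6B and
J6A.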

After purification of the long PCR products
with QIAquick PCR purification kit (QIAGEN), A-tailing
reactions were performed with AmpliTaq DNA polymerase
(Perkin-Elmer) at 72°C for 1 hour. The gel-purified
A-tailed PCR products were cloned into pCR2.1 vector
(Invitrogen) or pGEM-T Easy vector (Promega). DH5-alpha
competent cells (GIBCO/BRL) were transformed and
selected on LB agar plates containing 100 µg/ml
ampicillin (SIGMA) and amplified in LB liquid cultures
at 30°C for 18 - 20 hrs (Yanagi et al., 1997). Midiprep
was performed using Wizard Plus Midipreps DNA
Purification System (Promega). Multiple clones of the
J6S, J6A and the J6B fragments were sequenced.

The consensus sequence of strain HC-J6ch (nts. 
17-9629) was determined by direct sequencing of PCR
products (nts. 297-3004 and nts. 4893-5762) and by
sequence analysis of the TA clones (nts. 17-5488 and
nts. 5515-9629) (Fig. 1). Both strands of DNA were
sequenced in all cases. Analyses of genomic sequences,
including multiple sequence alignments and tree
analyses, were performed with GeneWorks (Oxford
Molecular Group) (Bukh et al., 1995).
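
The nucleotide ranges quoted above can be checked for complete coverage
of nts. 17-9629. The following is a minimal sketch, assuming inclusive
coordinates; the helper name is hypothetical. It shows that the TA
clones alone leave nts. 5489-5514 uncovered and that the directly
sequenced PCR products close that gap.

```python
def merge_intervals(intervals):
    """Merge inclusive nucleotide ranges that overlap or are directly adjacent."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Ranges quoted in the text (inclusive nucleotide positions).
TA_CLONES = [(17, 5488), (5515, 9629)]
DIRECT_PCR = [(297, 3004), (4893, 5762)]

clones_only = merge_intervals(TA_CLONES)
combined = merge_intervals(TA_CLONES + DIRECT_PCR)
```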

Construction of chimeric cDNA clones of genotypes 1a and 2a

Four full-length intertypic chimeric cDNA
clones were constructed (Figs. 4, 5A, 5B). In each clone
the C, E1 and E2 genes encoded the consensus amino acid
sequence of HC-J6ch. The p7 protein was encoded either by
the HC-J6ch or pCV-H77C consensus sequence, and the NS
proteins were all encoded by pCV-H77C genes. To
engineer these cDNA clones, an NdeI site from pCV-H77C
was first eliminated by a silent substitution (C to T)
at position 9158. In brief, two fragments were
amplified from pCV-H77C with primers H7851S and
H9173R(M) and with primers H9140S(M) and H9417R (Table
3), gel-purified and used for fusion PCR with primers
H7851S and H9417R. The fusion PCR products were cloned
into pCV-H77C by using HindIII and AflII sites. A new
artificial NdeI site was introduced by a silent
substitution (C to T) at position 2765. PCR products,
which were amplified from pCV-H77C with primer H2751SII
containing artificial ClaI and NdeI sites and primer
H2870R, were cloned into the modified pCV-H77C by using
ClaI and Eco47III sites. The final construct (pH77CV)
was used as a cassette vector to construct the
intertypic chimeric HCV cDNA clones.
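
A silent substitution of the kind described above removes (or creates)
a restriction site without changing the encoded amino acids. The sketch
below illustrates the check with the standard genetic code; the
nine-nucleotide fragment is hypothetical and is not the pCV-H77C
sequence, and the helper names are assumptions made for illustration.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NDEI_SITE = "CATATG"


def translate(dna):
    """Translate an in-frame DNA string whose length is a multiple of 3."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitute(dna, index, new_base):
    """Return dna with the base at the 0-based index replaced."""
    return dna[:index] + new_base + dna[index + 1:]


# Hypothetical in-frame fragment (not the pCV-H77C sequence): GCC ATA TGT.
original = "GCCATATGT"
mutated = substitute(original, 2, "T")  # C to T in the third codon position

is_silent = translate(original) == translate(mutated)
site_removed = NDEI_SITE in original and NDEI_SITE not in mutated
```

In this illustrative fragment the C to T change keeps the encoded
Ala-Ile-Cys unchanged while destroying the NdeI recognition sequence.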

The four chimeric cDNA clones were constructed
as follows. pH77CV-J6S (nucleotide sequence shown in
SEQ ID No: 3 and amino acid sequence shown in SEQ ID
No: 4): The AgeI/BsmI fragment of clone J6S2 and the
BsmI/NdeI fragment of clone J6S1 were cloned into
pH77CV by using AgeI and NdeI sites; pH77(p7)CV-J6S
(nucleotide sequence shown in SEQ ID No: 5 and amino acid
sequence shown in SEQ ID No: 6): A fragment of pH77CV-J6S
was replaced with a fragment amplified from pCV-H77C
with primers J6-H2556S and H2786R by using BsaBI and
NdeI sites; pH77-J6S (nucleotide sequence shown in SEQ ID
No: 7 and amino acid sequence shown in SEQ ID No: 8): A
fragment amplified from pCV-H77C with primers a-1
and 356RF-J6H77 and another fragment amplified from
pH77CV-J6S with primers 333S-J6 and 753R-J6 were
gel-purified and a fusion PCR was performed with primers
a-1 and 753R-J6. The AgeI/ClaI fragment of the
subcloned fusion PCR products and the ClaI/NdeI fragment
of pH77CV-J6S were cloned into pH77CV-J6S by using AgeI
and NdeI sites; pH77(p7)-J6S (nucleotide sequence shown
in SEQ ID No: 9 and amino acid sequence shown in SEQ ID
No: 10): The AgeI/ClaI fragment of pH77-J6S and the
ClaI/NdeI fragment of pH77(p7)CV-J6S were cloned into
pH77(p7)CV-J6S by using AgeI and NdeI sites.

Each intertypic chimeric cDNA clone was
retransformed to select a single clone, and large-scale
preparation of plasmid DNA was performed with a QIAGEN
plasmid Maxi kit as described previously (Yanagi et al.,
1997). Each of the four cDNA clones was completely
sequenced before inoculation. Each clone was
genetically stable since the digestion pattern was as
expected following retransformation and the complete
sequence was the expected one.

Construction of full-length cDNA clone HC-J6ch 

An overview of the full-length HC-J6ch clone is 
presented in Fig. 1. In the final construct pJ6CF,
which encodes the consensus polyprotein of HC-J6ch, an
XbaI site was eliminated by a silent substitution (A to
G) at position 5494. Digested fragments containing the
consensus sequence were purified from the appropriate
subclones and ligated using the sites indicated. The
full-length cDNA clone (pJ6CF) was retransformed to
select a single clone, and large-scale preparation of
plasmid DNA followed by the complete sequence analysis 
was performed. Clone pJ6CF was genetically stable. 

Intrahepatic transfection of chimpanzee with transcribed RNA

In duplicate 100 µl reactions, RNA was
transcribed in vitro with T7 RNA polymerase (Promega)
from 10 µg of template plasmid linearized with XbaI
(Promega) as described previously (Yanagi et al., 1997).
The integrity of the RNA was checked by electrophoresis
through agarose gel stained with ethidium bromide
(Yanagi et al., 1997). Each transcription mixture was
diluted with 400 µl of ice-cold phosphate-buffered
saline without calcium or magnesium and then immediately
frozen on dry ice and stored at -80°C. Within 24 hours,
both transcription mixtures were injected into the same
chimpanzee by percutaneous intrahepatic injection guided
by ultrasound (Yanagi et al., 1998, 1999). If the
chimpanzee did not become infected, the same
transfection was repeated once. After two negative
results, the next clone was inoculated into the same
chimpanzee following the same protocol. Injections were
performed at weeks 0 and 2 with pH77CV-J6S, at weeks 5
and 8 with pH77(p7)CV-J6S, at weeks 14 and 16 with
pH77-J6S, at weeks 19 and 23 with pH77(p7)-J6S, at week
28 with pJ6CF, and finally at week 34 with pCV-H77C.
The chimpanzee was maintained under conditions that met
or exceeded all requirements for its use in an approved
facility.

Serum samples were collected weekly from the
chimpanzee and monitored for liver enzyme levels by
standard procedures, anti-HCV antibodies by the
second-generation ELISA (Abbott) and HCV RNA by a
sensitive RT-nested PCR assay with AmpliTaq Gold DNA
polymerase using primers from the 5' UTR (Yanagi et al.,
1995). Samples were scored as negative for HCV RNA if
two independent tests on 100 µl of serum were negative.
The genome equivalent (GE) titer of HCV in positive
samples was determined by RT-nested PCR on 10-fold
serial dilutions of the extracted RNA (Bukh et al.,
1998). The consensus sequence of the complete ORF from
the chimpanzee infected with RNA transcripts of pJ6CF
was determined by direct sequencing of overlapping PCR
products obtained by long RT-nested PCR as previously
described (Yanagi et al., 1997) with HC-J6 specific
primers. After the intrahepatic transfection with RNA
transcripts of pCV-H77C, we performed H77 (genotype
1a)-specific RT-nested PCR with primers 2427S-H77 and
2832R-H77 for the 1st round and with primers 2462S-H77
and 2645R-H77 for the 2nd round (Table 3). The
sensitivity of this assay was equivalent to that of the
assay using 5' UTR primers when testing serum containing
only H77, genotype 1a. The genome titer of genotype 1a
was determined by using this specific RT-nested PCR on
10-fold serial dilutions of the extracted RNA.
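
An end-point titer from 10-fold serial dilutions can be estimated as
sketched below. This is an illustrative calculation only: it assumes
that the last positive dilution contains about one detectable genome,
and the serum volume represented by the undiluted reaction is a
hypothetical parameter that is not stated in the text.

```python
def endpoint_titer(results, input_volume_ml):
    """Estimate genome equivalents (GE) per ml from 10-fold dilutions.

    results: dict mapping dilution exponent n (sample diluted 10**-n)
             to True/False for a positive/negative RT-nested PCR.
    input_volume_ml: serum volume represented by the undiluted reaction
             (an assumed parameter, not given in the text).
    The highest positive dilution is taken as the end point.
    """
    positives = [n for n, positive in results.items() if positive]
    if not positives:
        return None  # below the detection limit of the assay
    return 10 ** max(positives) / input_volume_ml


# Hypothetical series: positive through 10^-3, negative at 10^-4.
series = {0: True, 1: True, 2: True, 3: True, 4: False}
titer = endpoint_titer(series, input_volume_ml=0.01)
```

With these hypothetical inputs the estimate is about 10^5 GE/ml.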

EXAMPLE 1 

Sequence analysis of HCV strain HC-J6ch 

As minor deviations from the consensus amino
acid sequence were found previously to render
full-length HCV cDNA clones noninfectious (Yanagi et
al., 1997, 1998), the consensus sequence of the cloning
source of genotype 2a (strain HC-J6ch) was determined 
prior to constructing any full-length clones. In brief, 
a plasma pool containing strain HC-J6ch was prepared from 
acute phase plasmapheresis units collected from a 
chimpanzee experimentally infected with HC-J6 (Okamoto 
et al., 1991). The HCV genome titer of this pool was 
10^-* genome equivalents (GE) /ml (Quantiplex' HCV RNA 
bDNA 2.0, Chiron) and the infectivity titer was 10^ 
chimpanzee infectious doses/ml. 

The consensus sequence of the 5' UTR of HC-J6ch 
(nts. 17-340) was deduced from 5 clones containing nts. 
17-297 and 8 clones containing nts. 86-340. The 5' UTR 
of the various clones was highly conserved, but the 
consensus sequence of HC-J6ch differed by 2 nucleotides 
from that published previously for HC-J6 (Okamoto et
al., 1991: C to T at position 36 and T to C at position
222).

The consensus sequence of 14 clones of the 3'
UTR of HC-J6ch indicated that the 39-nucleotide-long
variable region was highly conserved in this strain and
was identical to that previously published for HC-J6
(Okamoto et al., 1991). The polypyrimidine tract varied
greatly in length (84-164 nucleotides), and contained
some conserved A residues. In the conserved region, the
proximal 16 nucleotides were identical to those
previously published for isolates of different HCV
genotypes (Kolykhalov et al., 1996; Tanaka et al., 1996;
Yamada et al., 1996). The remaining 82 nucleotides of
the conserved region were determined for other genotype
2a strains (Tanaka et al., 1996) but not for HC-J6 or
HC-J6ch.

The ORF of HC-J6ch was amplified in 3 fragments 
by RT-PCR (Fig. 1). Eight clones of the J6S fragment 
(nts. 86-2761), 6 clones of the J6B fragment (nts. 
2573-5488) and 6 clones of the J6A fragment (nts. 
5515-9298) were sequenced. PCR fragments containing 
nts. 5489-5514 were sequenced directly. A quasispecies 
was found at 243 nucleotide (2.7%) and 69 amino acid 
(2.3%) positions, scattered throughout the 9099 nts 
(3033 aa) of the ORF. However, the majority, 231 
nucleotide substitutions, were detected only once and 
71.6% of these represented silent mutations. The 12 
remaining nucleotide substitutions were each restricted 
to 2 clones and only 4 of these resulted in amino acid 
changes. The nucleotide difference among the J6S clones 
ranged from 0.1 to 1.3%, among the J6B clones it ranged 
from 0.1 to 0.3%, and among the J6A clones it ranged 
from 0.2 to 4.0% (Fig. 2). Three of 8 J6S clones, 4 of 
6 J6B clones, and all 6 J6A clones had defective 
polyproteins due to nucleotide deletions, insertions or 
substitutions. 
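
The clone comparison described above can be illustrated with a minimal sketch. It assumes the clones are already aligned to the same length, uses a simple majority rule for the consensus, and reports ties as "N". The function names and the 12-nt toy sequences are hypothetical and are not HC-J6ch data; the sketch is not the procedure actually used in the study.

```python
from collections import Counter
from itertools import combinations


def consensus(clones):
    """Majority-rule consensus of pre-aligned, equal-length sequences.

    Columns with a tie for the most common base are reported as 'N'.
    """
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    bases = []
    for column in zip(*clones):
        ranked = Counter(column).most_common()
        tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        bases.append("N" if tied else ranked[0][0])
    return "".join(bases)


def percent_difference(a, b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def pairwise_range(clones):
    """Smallest and largest pairwise percent difference among clones."""
    values = [percent_difference(a, b) for a, b in combinations(clones, 2)]
    return min(values), max(values)


# Hypothetical 12-nt toy clones (illustration only).
toy_clones = ["ATGAGCACAAAT", "ATGAGCACGAAT", "ATGAGCACAAAT", "ATGTGCACAAAT"]
print(consensus(toy_clones))
print(pairwise_range(toy_clones))
```

Substitutions seen in only one clone, as for most of the 231 singletons reported above, do not affect a majority-rule consensus, which is why a consensus can be unambiguous even when many clones are individually defective.
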

The sequences of clones of strain HC-J6ch were 
relatively homogeneous. This was highlighted by the 
high degree of conservation among clones of the HVR1 
(Fig. 3), a region frequently used to study the 
quasispecies of HCV (Bukh et al., 1995). An exception 
was the sequence of clone J6A1, which differed by about 
4% from the other clones of this region (Fig. 2). 
Importantly, the consensus sequence of strain HC-J6ch 
(nts. 17-9629) could be determined with no ambiguity at 
the nucleotide or deduced amino acid level. The 
difference between the consensus ORF sequence of HC-J6ch 
from the experimentally infected chimpanzee and that of 
HC-J6 of the inoculum (Okamoto et al., 1991) was 4.1% 
and 2.2% at the nucleotide and deduced amino acid 
levels, respectively (Fig. 2, Table 2). Moreover, we 
found that 12 (44.4%) of the 27 amino acids constituting 
HVR1 differed between HC-J6ch and HC-J6 (Fig. 3). Such 
diversities are greater than the < 2% generally 
considered to comprise a quasispecies. In fact, these 
differences are equivalent to those found between the 
two prototype strains of HCV genotype 1a [strains HCV-1 
(Choo et al., 1991) and H77 (Yanagi et al., 1997)]. 
These results indicated that HC-J6ch, which represented 
the major species in the experimentally infected 
chimpanzee, was a minor species in the original 
inoculum. 

TABLE 2 

Percent difference of nucleotide and predicted amino acid sequences 
between strain HC-J6 (Okamoto et al., 1991) and strain HC-J6ch from 
acute phase plasma pool of a chimpanzee inoculated with HC-J6 

Genome Region   nt. position [a]   % nt. difference      % a.a. difference 
ORF             341-9439           4.1 (373/9099) [b]    2.2 (66/3033) [b] 
5' UTR          17-340             0.6 (2/324) 
Core            341-913            0.5 (3/573)           0 (0/191) 
E1              914-1489           4.3 (25/576)          2.1 (4/192) 
HVR1            1490-1570          24.7 (20/81)          44.4 (12/27) 
E2-HVR1         1571-2590          3.9 (40/1020)         3.2 (11/340) 
p7              2591-2779          3.7 (7/189)           3.2 (2/63) 
NS2             2780-3430          4.0 (26/651)          2.8 (6/217) 
NS3             3431-5323          4.0 (76/1893)         0.8 (5/631) 
NS4A            5324-5485          4.3 (7/162)           1.9 (1/54) 
NS4B            5486-6268          3.7 (29/783)          0.4 (1/261) 
NS5A            6269-7666          5.4 (75/1398)         3.4 (16/466) 
NS5B            7667-9439          3.7 (65/1773)         1.4 (8/591) 
3' UTR          9440-9481          0 (0/42) 

[a] The nucleotide positions correspond to those of the infectious 
full-length genotype 2a clone (pJ6CF). 
[b] The numbers in parentheses indicate the nucleotide or amino acid 
differences for each region. 
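
The internal consistency of Table 2 can be checked with a short sketch. The coordinates and difference counts below are copied from the table; the data layout and the `summarize` helper are assumptions made for illustration. The sketch confirms that the ORF regions are contiguous, that each spans a whole number of codons, and that the per-region counts add up to the ORF totals.

```python
# (region, first nt, last nt, nt differences, aa differences), from Table 2.
TABLE2_ORF_REGIONS = [
    ("Core", 341, 913, 3, 0),
    ("E1", 914, 1489, 25, 4),
    ("HVR1", 1490, 1570, 20, 12),
    ("E2-HVR1", 1571, 2590, 40, 11),
    ("p7", 2591, 2779, 7, 2),
    ("NS2", 2780, 3430, 26, 6),
    ("NS3", 3431, 5323, 76, 5),
    ("NS4A", 5324, 5485, 7, 1),
    ("NS4B", 5486, 6268, 29, 1),
    ("NS5A", 6269, 7666, 75, 16),
    ("NS5B", 7667, 9439, 65, 8),
]


def summarize(regions):
    """Return (name, nt length, aa length, % nt diff, % aa diff) per region."""
    rows = []
    expected_start = regions[0][1]
    for name, start, end, nt_diff, aa_diff in regions:
        if start != expected_start:
            raise ValueError(f"{name} is not contiguous with the previous region")
        nt_len = end - start + 1
        if nt_len % 3:
            raise ValueError(f"{name} is not a whole number of codons")
        aa_len = nt_len // 3
        rows.append((name, nt_len, aa_len,
                     round(100 * nt_diff / nt_len, 1),
                     round(100 * aa_diff / aa_len, 1)))
        expected_start = end + 1
    return rows


rows = summarize(TABLE2_ORF_REGIONS)
total_nt = sum(row[1] for row in rows)
total_nt_diff = sum(region[3] for region in TABLE2_ORF_REGIONS)
total_aa_diff = sum(region[4] for region in TABLE2_ORF_REGIONS)
for row in rows:
    print(row)
print(total_nt, total_nt_diff, total_aa_diff)
```

The totals agree with the ORF row of the table: 9099 nucleotides, 373 nucleotide differences and 66 amino acid differences.
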



Example 2 

Chimeric molecular clones 

As chimeric flaviviruses with substituted 
structural genes have been useful in defining the 
biological function of viral sequences or proteins, in 
analyzing immune responses and in generating attenuated 
vaccine candidates (Bray and Lai, 1991; Chambers et al., 
1999; Pletnev et al., 1992, 1993, 1998), the consensus 
sequence of the 2a structural genes and surrounding 
region was substituted for that of the infectious 1a 
cDNA clone. In the genotype 1a backbone, two silent 
mutations were introduced for cloning purposes [at 
positions 2765 (p7) and 9158 (NS5B) of pCV-H77C] (Fig. 
4). The complete sequence of each chimera was verified. 
Infectivity of RNA transcripts from four different 
intertypic chimeric clones (Figs. 4, 5A, 5B) was evaluated 
by consecutive intrahepatic transfections of a chimpanzee. 
Clones were considered not to be viable if viral RNA was not 
detected in the serum within two weeks of the repeat 
transfection. All chimeric clones contained the C, E1 and E2 
genes of genotype 2a. The two chimeric clones tested 
initially differed from each other in that one had the p7 
gene of 2a (pH77CV-J6S) and the other [pH77(p7)CV-J6S] the 
p7 gene of 1a. They differed from the two other clones in 
that the 186 nucleotides of the 5' UTR just upstream of the 
initiation codon were from the 2a genotype. Since neither 
clone containing the chimeric 5' UTR was infectious, the 
chimeric 5' UTR was replaced with the consensus genotype 1a 
5' UTR to generate the two p7 varieties [pH77-J6S and 
pH77(p7)-J6S]. After consecutive transfection of the four 
clones, no HCV RNA, anti-HCV or ALT elevation was detected 
in the chimpanzee during 28 weeks of follow-up, suggesting 
that RNA transcripts from these intertypic chimeric clones 
were not viable in vivo. 

The finding that the intertypic clones between 
genotypes 1a and 2a were not viable was surprising since 
flavivirus chimeras containing the structural region of 
dengue virus type 1 or 2 or of tick-borne encephalitis virus 
and the nonstructural region of an infectious dengue type 4 
virus were viable (Bray and Lai, 1991; Pletnev et al., 1992, 
1993). While considerable sequence variation exists between 
the infectious genotype 1a and 2a clones of HCV (Table 3), 
those flaviviruses exhibit a higher degree of genetic 
heterogeneity than do the major genotypes of HCV. For other 
flaviviruses, however, it was possible to obtain 
infectious chimeric clones only if the capsid region was 
derived from the backbone cDNA clone (Chambers et al., 
1999; Pletnev and Men, 1998). 



TABLE 3 

Percent difference of the amino acid sequences between 
the infectious clone of genotype 1a (pCV-H77C; 
Yanagi et al., 1997) and the infectious clone of 
genotype 2a 

Genome Region [a]   % difference 
Polyprotein         27.9 (839/3007) [b] 
Core                8.9 (17/191) 
E1                  37.0 (71/192) 
HVR1                59.3 (16/27) 
E2-HVR1             27.1 (91/336) 
p7                  38.1 (24/63) 
NS2                 41.9 (91/217) 
NS3                 19.2 (121/631) 
NS4A                33.3 (18/54) 
NS4B                26.8 (70/261) 
NS5A                38.5 (171/444) 
NS5B                25.2 (149/591) 

[a] Genome regions defined as in Table 1. 
[b] The numbers in parentheses indicate the amino 
acid differences for each region. 
Positions with deletions or insertions in E2 (4 
aa positions) and NS5A (26 aa positions) were 
not considered. 



Trivial explanations may account for the lack 
of viability of these intertypic chimeras. First, the 
two silent mutations introduced in the genotype 1a 
backbone (one in p7 and one in NS5B) for cloning 
purposes could potentially eliminate infectivity. This 
is, however, very unlikely since mutations at these 
positions exist among field isolates of HCV including 
strain HC-J6ch (Bukh et al., 1998). Also, it is 
noteworthy that the three previously published 
infectious clones of strain H77 had numerous silent 
nucleotide differences (Hong et al., 1999; Kolykhalov et 
al., 1997; Yanagi et al., 1997). Second, signal 
peptidases might not cleave the chimeric E2/p7 or p7/NS2 
junction. This seems unlikely, however, since 
eukaryotic signal peptidases typically recognize the 
amino acid sequences upstream of the cleavage site [the 
(-3, -1) rule] (Nielsen et al., 1997) and the amino 
acids at these two sites are conserved between genotypes 
1a and 2a (Fig. 5B). Finally, the E2/p7 and/or p7/NS2 
gene junctions could differ between genotypes 1a and 2a. 
The junctions determined for genotypes 1a and 1b were 
used (Lin et al., 1994; Mizushima et al., 1994; Selby et 
al., 1994) because those for genotype 2a have not been 
identified. In the latter two cases, further analyses 
of genotype 2a should eventually provide sufficient data 
to overcome such potential problems and it would most 
likely be possible to construct a viable chimera. 
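
The (-3, -1) rule mentioned above can be expressed as a small check. The sketch below uses a commonly cited simplified formulation: a small residue at position -1, and no aromatic, charged or large polar residue at position -3. The exact residue sets, the function name and the toy peptides are assumptions of this sketch; they are not taken from the HCV E2/p7 or p7/NS2 junctions, and real signal peptidase prediction (Nielsen et al., 1997) uses more context than these two positions.

```python
# Simplified (-3, -1) rule; the residue sets are an assumption of this sketch.
SMALL_AT_MINUS_1 = set("ASGCTQ")          # residues tolerated at position -1
EXCLUDED_AT_MINUS_3 = set("FHYWDEKRNQ")   # aromatic, charged, large polar


def satisfies_minus3_minus1(protein, cleavage_index):
    """Check positions -1 and -3 relative to a putative cleavage site.

    cleavage_index is the number of residues upstream of the scissile bond,
    so protein[cleavage_index - 1] is position -1.
    """
    if cleavage_index < 3 or cleavage_index > len(protein):
        raise ValueError("cleavage_index must leave at least 3 upstream residues")
    minus1 = protein[cleavage_index - 1]
    minus3 = protein[cleavage_index - 3]
    return minus1 in SMALL_AT_MINUS_1 and minus3 not in EXCLUDED_AT_MINUS_3


# Hypothetical toy peptides (one-letter code), cleaved after residue 7.
print(satisfies_minus3_minus1("MKLLAVADE", 7))  # A at -3, A at -1
print(satisfies_minus3_minus1("MKLLRVKDE", 7))  # R at -3, K at -1
```

Under this formulation, two junctions pass or fail together whenever their residues at -3 and -1 are identical, which is the basis of the conservation argument made above.
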

More complicated explanations for the lack of 
viability of the chimeras might be required if critical 
genotype-specific interactions occur among the 
structural proteins, the nonstructural proteins and the 
genomic RNA. For instance, one cannot rule out that the 
chimeras were not viable because the IRES function was 
compromised. In in vitro studies the IRES activity 
depended on RNA sequences not only in the 5' UTR but 
also extending 3' of the translation initiation site 
(Hahm et al., 1998; Lemon and Honda, 1997; Reynolds et 
al., 1995). Although the 3' border of the HCV IRES is 
still controversial, it is believed to involve at most 
the first 39 nts of the core gene (Lemon and Honda, 
1997). The 5' UTR of the intertypic chimeras was either 
a chimera of genotype 1a and 2a sequences or the entire 
5' UTR was derived from the 1a clone (Figs. 4, 5A). 
Importantly, the 5' end of core is conserved among 
genotypes 1a and 2a (Fig. 5A). Thus, the predicted 
IRES-like secondary structure is maintained in these 
chimeras, suggesting that the IRES activity most likely 
was maintained. 

Interactions between the structural 
proteins and the nonstructural proteins and/or the 
genomic RNA that involve RNA packaging, replication or 
translation are also conceivable. In poliovirus, which is 
another positive-sense RNA virus, functional coupling of 
RNA packaging to RNA replication and of RNA replication 
to translation has been suggested (Novak and 
Kirkegaard, 1994; Nugent et al., 1999). Similar to 
other viruses of the Flaviviridae family, a membrane- 
associated replicase complex is thought to initiate 
replication at the 3' end of HCV and to synthesize a 
complementary negative-strand RNA (Rice, 1995). The 
putative cis-acting elements at the 5' and 3' termini 
which are believed to be important for viral genome 
replication (Rice, 1996; Frolov et al., 1998) should be 
maintained in the intertypic HCV chimeras, at least in 
the two constructs with the authentic 1a 5' UTR. 
However, it is conceivable that the viral packaging 
system was interrupted (Frolov et al., 1998). Studies 
using a Kunjin flavivirus replicon system and providing 
the structural proteins in trans suggested that the 
essential encapsidation signals did not reside in the 
structural region of the genome (Khromykh et al., 1997, 
1998). The location of the packaging signals of HCV is 
not known. However, if the structural proteins 
encapsidate viral RNA via genotype-specific sequences 
outside of the structural region, the chimeras would be 
unable to package the RNA and it might be extremely 
difficult to construct viable chimeras between highly 
divergent strains. 

Example 3 

A consensus molecular clone of 
genotype 2a is infectious in vivo 

In order to prove that the genotype 2a portion 
used in the 4 intertypic chimeric cDNA clones indeed 
represented the infectious sequence, a consensus 
full-length cDNA clone of HC-J6ch (pJ6CF) was constructed. 
The core sequence of the T7 promoter, a 5' guanosine 
residue and the full-length sequence of HC-J6ch (9711 
nts) were cloned into pGEM-9Zf vector using NotI/XbaI 
sites. Within the HCV sequence there were no deduced 
amino acid differences and only 4 nucleotide differences 
(at nucleotide positions 1822, 5494, 9247 and 9289) from 
the consensus sequence of HC-J6ch as determined in the 
present study. The silent mutation at position 1822 was 
within the structural region and so was also present in 
the four intertypic chimeras. The 5' terminal 16 nts 
and the 3' terminal 82 nts were deduced from previously 
published HCV genotype 2a sequences (Okamoto et al., 
1991; Tanaka et al., 1996). The full-length cDNA clone 
of genotype 2a contained a 5' UTR of 340 nts, an ORF of 
9099 nts encoding 3033 amino acids and a 3' UTR 
consisting of a variable region of 39 nts followed by a 
132 nucleotide-long polypyrimidine tract interrupted 
with 3 A residues and the 3' terminal conserved region 
of 98 nts. 

RNA transcripts from pJ6CF were injected into 
the same chimpanzee used for injection of the 4 
intertypic chimeras. The chimpanzee became infected at 
the first attempt with an HCV titer of 10^ GE/ml at week 
1 post inoculation (p.i.), and 10^-10^ GE/ml during weeks 
2 to 6 p.i. The consensus sequence of PCR products of 
the complete ORF, amplified from serum obtained during 
week 5 p.i., was identical to the sequence of pJ6CF and 
there was no evidence of a quasispecies. Since RNA 
transcripts of this genotype 2a clone were 
infectious in vivo, and it shared an exact sequence with 
the non-infectious intertypic chimeric clones, their 
failure to replicate must have been the result of 
incompatibilities between the genotype 1a and 2a 
sequences. 

To confirm that the chimpanzee used was 
susceptible also to infection by genotype 1a, which 
comprised most of the intertypic chimeras, the 
chimpanzee was subsequently inoculated with RNA 
transcripts from the infectious genotype 1a clone 
(pCV-H77C). Serum samples were tested in an 
H77-specific RT-PCR assay to identify super-infection 
with genotype 1a. At week 1 p.i. the total HCV genome 
titer was 10^ GE/ml and the H77-specific (1a) genome 
titer was 10^ GE/ml. The H77-specific genome titer 
increased to 10^ GE/ml at week 2 p.i., and reached 10^ 
GE/ml during weeks 3-6 p.i. The consensus sequence of 
PCR products amplified with H77-specific primers at 
weeks 1-6 p.i. was found to be identical to that of 
pCV-H77C. However, the direct sequences of PCR products 
amplified with the 5' UTR primers at weeks 1-2 after 
inoculation of pCV-H77C were identical to that of pJ6CF, 
indicating that the 2a genotype was still present and 
represented the majority species. These experiments 
confirmed that the inability of the intertypic 1a/2a 
cDNA clones to infect the chimpanzee was not the result 
of protective immune responses in the chimpanzee but 
represented deficiencies intrinsic to the chimeras. 

Discussion 

The published infectious cDNA clones of HCV 
represent the two most important subtypes of genotype 1 
(Hong et al., 1999; Kolykhalov et al., 1997; Yanagi et 
al., 1997, 1998). However, 5 more major genotypes of 
HCV are recognized. In the above Examples, the 
infectivity of a cDNA clone of a second major HCV 
genotype was demonstrated. As in previous studies, the 
infectivity of RNA transcripts was demonstrated in vivo 
by intrahepatic transfection of a chimpanzee. This new 
infectious clone (pJ6CF) encodes the consensus 
polyprotein of HCV strain HC-J6ch, genotype 2a. Its 
encoded polyprotein differs from those of the infectious 
clones of genotypes 1a and 1b by approximately 30% 
(Table 3). Genotype 2 strains, in particular subtypes 
2a and 2b, have a worldwide distribution and important 
differences between genotypes 1 and 2 with respect to 
pathogenesis and treatment were indicated in previous 
studies. The availability of an infectious clone 
representing a second major genotype of HCV should 
permit new ways of studying the molecular biology and 
immunopathology of this important and genetically quite 
different human pathogen. 

The 5' and 3' UTRs of HCV are believed to be 
critical for viral replication, translation and viral 
packaging (Rice, 1996). The 5' 203 terminal nucleotides 
and the 3' 101 terminal nucleotides of the published 
infectious clones of genotypes 1a and 1b were identical. 
However, the sequences of the UTRs of the genotype 2a clone 
differ from those of the genotype 1 clones. Overall, 
the 5' UTR of the genotype 2a clone has 17 nt 
differences and a single nucleotide deletion compared 
with the infectious clones of genotype 1a (Fig. 5A). 
Five of these differences and the deletion are within 
the first 30 nucleotides, whereas the remainder are 
found within the predicted IRES structure. Differences 
also exist between the 3' UTR of the genotype 2a clone 
and the clones of genotype 1a (Fig. 5B). The sequences 
of the variable region are very different. A recent study 
has shown that this region is not critical for infectivity in 
vivo (Yanagi et al., 1999). Within the regions which 
are critical for infectivity in vivo (Yanagi et al., 
1999), the 132 nucleotide-long polypyrimidine tract of 
the genotype 2a clone has 3 unique A residues 
interspersed and the 3' terminal conserved region of 98 
nts has 4 nt differences within the 3' terminal stable 
stem-loop structure (Fig. 5B) (Kolykhalov et al., 1996; 
Tanaka et al., 1996). Since the 2a clone was infectious, 
these sequence differences are apparently real and are 
compatible with infectivity. Further studies are 
required to determine whether these represent critical 
genotype-specific sequences. 

WHAT IS CLAIMED IS: 

1. A purified and isolated nucleic acid 
molecule which encodes human hepatitis C virus of 
genotype 2a, said molecule capable of expressing said 
virus when transfected into cells. 

2. The nucleic acid molecule of claim 1, 
wherein said molecule encodes the amino acid sequence of 
SEQ ID NO:2. 

3. The nucleic acid molecule of claim 2, 
wherein said molecule comprises the nucleic acid 
sequence of SEQ ID NO:1. 

4. A DNA construct comprising a nucleic acid 
molecule according to claim 1. 

5. A DNA construct comprising a nucleic acid 
molecule according to claim 3. 

6. An RNA transcript of the DNA construct of 
claim 4. 

7. An RNA transcript of the DNA construct of 
claim 5. 

8. A cell transfected with the DNA construct 
of claim 4. 

9. A cell transfected with the DNA construct 
of claim 5. 

10. A cell transfected with RNA transcript of 
claim 6. 

11. A cell transfected with RNA transcript of 
claim 7. 

12. A hepatitis C virus polypeptide produced 
by the cell of claims 8 or 9. 

13. A hepatitis C virus polypeptide produced 
by the cell of claims 10 or 11. 

14. A hepatitis C virus produced by the cell 
of claims 8 or 9. 

15. A hepatitis C virus produced by the cell 
of claims 10 or 11. 

16. A hepatitis C virus whose genome 
comprises a nucleic acid molecule according to claim 1. 

17. A hepatitis C virus whose genome 
comprises a nucleic acid molecule according to claim 3. 

18. A method for producing a hepatitis C 
virus comprising transfecting a host cell with the RNA 
transcript of claims 6 or 7. 

19. A polypeptide encoded by a nucleic acid 
sequence according to claim 1. 

20. A polypeptide encoded by a nucleic acid 
sequence according to claim 3. 

21. The polypeptide of claim 19, wherein said 
polypeptide is selected from the group consisting of NS3 
protease, E1 protein, E2 protein or NS4 protein. 

22. The polypeptide of claim 20, wherein said 
polypeptide is selected from the group consisting of NS3 
protease, E1 protein, E2 protein or NS4 protein. 

23. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing a cell containing the hepatitis 
C virus of claims 16 or 17 to the 
candidate antiviral agent; and 
b) measuring the presence or absence of 
hepatitis C virus replication in the cell 
of step (a). 

24. The method of claim 23, wherein said 
replication in step (b) is measured by at least one of 
the following: negative strand RT-PCR, quantitative 
RT-PCR, Western blot, immunofluorescence, or infectivity in 
a susceptible animal. 

25. A method for assaying candidate antiviral 
agents for activity against HCV, comprising: 

a) exposing an HCV protease encoded by a 
nucleic acid sequence according to claims 
1 or 3 or a fragment thereof to the 
candidate antiviral agent in the presence 
of a protease substrate; and 
b) measuring the protease activity of said 
protease. 

26. The method of claim 25, wherein said HCV 
protease is selected from the group consisting of an NS3 
domain protease, an NS3-NS4A fusion polypeptide, or an 
NS2-NS3 protease. 

27. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 23. 

28. An antiviral agent identified as having 
antiviral activity for HCV by the method of claim 25. 

29. Antibody to the polypeptide of claim 19. 

30. Antibody to the polypeptide of claim 20. 

31. Antibody to the hepatitis C virus of 
claim 16. 

32. Antibody to the hepatitis C virus of 
claim 17. 

33. A method for determining the 
susceptibility of cells in vitro to support HCV 
infection, comprising the steps of: 

a) growing animal cells in vitro; 
b) transfecting into said cells the nucleic 
acid of claim 1; and 
c) determining if said cells show indicia of 
HCV replication. 

34. The method according to claim 33, wherein 
said cells are human cells. 

35. A composition comprising a polypeptide of 
claim 19 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

36. A composition comprising a polypeptide of 
claim 20 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

37. A composition comprising a nucleic acid 
molecule of claim 1 suspended in a suitable amount of a 
pharmaceutically acceptable diluent or excipient. 

[Nucleotide sequence listing, nucleotide positions 1-1900] 

FIG. 6A 

[Nucleotide sequence listing, nucleotide positions 1901-3800] 

FIG. 6B 

[Nucleotide sequence listing, nucleotide positions 3801-5700] 

FIG. 6C 

[Nucleotide sequence listing, nucleotide positions 5701-7600] 

FIG. 6D 
ciCAMGicr amicciasA. cposoxjct cxsicaccxxxs TOOocroaQs 7650 

AAGAftCAAAA ACIQOaCKrC AA03CACIG\ QCAAClLUiT QCI3V03CXaT 7700 

OOftlCIGG 'lUlM'lUJAC CACnCAOX: A OIOl ' l ' IUUL ' AAfi03CAI3AA 7750 

G?^AAGfICACA TTIGACAGAC laCSVAGTICT OSftCfiOXKT TZm^GGAOS 7800 

TOCICRAQ3A. QGflCftAftQCA Ga3GOGIC3\A AAGIGAftOSC TRACTiULm 7850 

TOCGISGftGG AAQLTiWJAG OCItSmir CX3CAncaG CXawaOCaA 7900 

GTITOQCiaT G3332AAAfiG ALUiULUilU CCKrOXSGR AflG3CXI3raG 7950 

CXX^CRTCftA. CroOSIGIQS AAAGftOCTIC 1G3«GfiCaG TCUVfiCRlXA 8000 

ATAGACRCm CXSOCftlQQC CAAG?^A03fiG (Ji ' i'i'iUlU JG TIC m jL'ilA 8050 

GA^m3333r OGfmAQOCAG CTOSICICftT OGIGTIDOOC GftDCIQ33GG 8100 

TOOGOSTCiG CX3AGAAGMG GCXXTOimS A03IG3mG CAftGCTOQC 8150 

dOQOOGflGA 1Q3GAAQCIC CmXSGATIC CAAISOCAC CAQGACAQOG 8200 

QoTIGRft TIC CraSIGCAAG (DSIOSAAGfIC CAAGAftGACC CDOaMQaOGT 8250 

ICiajDOGA mXmiTGr TnCACnXA CAGICACIGA GfiQC3GACATC 8300i 

CX3By3QGAGG AQSCAATTIA CCAATOnGT GACCIQ3RGC COCAAGOOCXS 8350= 

CGIQQOCATC AAGTCCCICA CIGAGAG9CT TimUi'iUCaS Q3 XCiLTiA 8400 

CXS^ATICAAG G3G33AAAAC TOQGSZmOC GC^OGflGOOS CQCGftGCJOaC 8450 

GDOGACAA CEAQCIGIQG TAACAIXCIC ALTlUL'l^ACA. TCAAQCXXXE 8500 

ascAODcror oGAOoaacaG QQanoGflOGA ciQcaocATC ci ccjiuiui u 8550 

G03A0SOT AG'iUUi'lMC TGIGAAAGIG 033333roCA QGAOSAOaOG 8600 

OCX3C0CTCA GAQC3CITCAC QGAGQClSmS AOCAQSDO' COQCO-CaJL' 8650 

Q933GA0aCX: CJ2ACAA0CAG AATADSOT GSAQCTEAm ACATCAIQCT 8700 

OCroCAAOGT GICAGTCQOC CAOGADQSDS CIQGAAAGRG 03ICI3RCIAC 8750 

CimX]03IG AOCTCAAC aXXXTO30S AGftgDGGJG T QQGAGACfiGC 8800 

. AAGACACACr OCAlSICAATT 0CTQ3CERQ3 CAACKD^ATC ftiUlTiUJUL ' 8850 

aSOCTGIG GCSOGAGGAIG ATACIGMGA OOCATnCIT TAODGIOCIC 8900 

ATAOrAQGG ATCAQCTIGA ACAQQCICIT AACIGIGfiGA TCI303»GC 8950 

crocTAcroc AmsAAocM: tosoceaoc TOCAAaxsar csvaagrcidc 9000 

ATOQOCTCAG 03CATrnCA ClOaOGIT ACIC'ICCftGS IGAAATCAAT 9050 

AG33IQ3QaG CAaOCCICfiG AAAACITOQS GI OJUJUULT TaOGft ULTiU 9100: 

GAGACADOQG QOCX33GfiQCX3 TOOaOQCERG aji'iL ' iUlLU AGAOSAOaCA 9150: 

u^'iLLijAT AIGIG3CAAG TAOCICTICA ACIG332RGT AAGAACAAAG 9200 

CICAAACICA CIOCAAnaOC Q9aCJXTO3C GGX'IOiaCT 'iUICmJl ' lU 9250 

GTiamSCT GQCBOQOG G333fiGfiCRT TIMCACAQC GIGICICAIG 9300 

CD333C]OOCX3 CTOGTICIOS Ti ' i'iUJL ' Eft C ICCBX'iajC T3CAGQQGIA 9350 

OQCAi ciaa: tcctxiutaa cxxsatcaaqg tig333iaaa CACi a mx 9400 

ICmAOOCA Ti'iLL'iUi'i'i' ' iTi'i'i ' i'lTi ' i ' 'i ' iTi ' i'ri ' i'iT ' iTi ' i'iUi ' i'i ' i' 9450 

TnTmcrr TocrnDcrr ci ' i ' i ' i ' mu: ' I ' i ' iL ' i ' ri ' i ' iL ' cxrncrnAA 9500 
IGGinxnJCC ATCmOOCC mSTCMJSJ: TAQCIGIGAA AGGTOOSIGA 9550 
Q009Cft3GRC TQCAGAGAGT Q3G?ffiACIG GCCiClL'lUL' ASftTCfilGT 9599 
M5INPKPQRK TKFNINRRPQ WifFPSXSQI V03VXILERR GEKIiSTRMR SO 

KTSERSQmS RRQPIEKAHR PETSIV^IG YIWELYGNEB a3»«OCLSP 100 

EGSRPSWGPT DERFRSRNLS KVIDILTOSF AECMSnPLV GAEDOSAAPA 150 

IAH3VPWLED CWNBOGNLP GCSFSXEIIA LLflCLlVPAS A3iQVRNS9GL 200 

WIMX3NS SIVYEAADAI IHTEOTVPCV RECNASRCW7 AVrPIVZORD 250 

GKLPTIQLRR HldEIMSSAT LCSALYVGOL OGSVEE^L FIFSERRHWr 300 

TQDCNCSIYP GarTOHFMRW IM4MNWSPIA ALWAQEIIU: PQMMCMrfiG 350 

AHWSVUiGIA YRSMWa^^RK VLWUIEAG VEftEIHVTOS NRGRTmCLV 400 

GLLTPGRWaN IQLININ3SW HINSEMJNCN ESINIQWLA3 IFXQHKFNSS 450 

GCPEHLASCR PLIEK^QGWG PISY?^N393L lERPiOfflDfP ERPCX3IVPAK 500 

svaspvrar pspwvstid rsgapiyswg andiewfvln NiRppoagwF 550 

GCIV^INSIGF TKVCCAPPCV IQGVGNNIIiL CPiUUt'HKHP EftlYSROSSG 600 

IWnnOM) yPYRUWHyPC UNyiTFKVR MJ^VGGVEHEIL EAftQMIRGE 650 

RCDLEUREI^ ELSELLLSTT QWgVLPCSET 1LPALSIGLI HlfjgNIVDVQ 700 

YLYGJGSSIA SMAIKWEXW LLFtllAEftR VC9CIJ&\MMLi ISQAEAALEN 750 

LVm^AAStA GIH3LVSFLV FPCEAWmCG RWVPGAVYAL vnM)jpr.T.T.T.T. 800 

lALEWSVL DTEVAASCQS VVLVGL24ALT LSPifYKRYIS WCMASCiSYFL 850 

TRVEftQLHVW VPPLNVRQGR DWILLICW HPILVFDTIK LLtAIFGEEW 900 

JLO^SLIKTP YFVFVQGLIil ICftLARKIAG attVOMAIIK IGftLTCIYVY 950 

NHLTOPrm HNGLMXAVA, VEPWFSPME TKLTEWGADT AACODIINGIj 1000 

PVSAERG3EI LLGPADGMVS KGWRLLAPIT A!ffiQQfIK3LL QCinSLTGR 1050 

EKN3VB3EVQ IVSEAIQIFL AlCINGWCWr VYHSAGIRIT ASPKGEVIQM 1100 

YINVDQDLVG WPAPQGSRSL TPCIC G SSDL YLVIPHADy/I PVRRRGDSRG 1150 

SLLSFRPISV Ii<GSSGSPLL CPA3HAV3LF RAAVCIPG^ KAVIFIIVEN 1200 

LGfTIMRSPVF lOSISSPPAVP QSPQVAHLHA PIGSGKSIKV PAAiaAQGKK 1250 

VLVLNESVAA ILGPGAMSK AKGVDENIRr GWKiTi'iLSP ITliSTWaCFL 1300 

ADQQCSQGAY DIIICEECHS IDATSHGIG TVLDQAEIRG APLWLMaT 1350 

PPGSVIVSHP NIEEVftLSrr GEn:PFW2<AI ELEVXKQGRH UPCHSKKKC 1400 

MLAAKLVAL GINAVMrYRS m/SVIPTSS IWVWSIDAL MlGblU U rLfc? 1450 

VUXNICVIQ IVIFSLDPIF TlEiTi'L PQP AVSKT^^RGR 1GRGECPGIXR 1500 

FVAPGERP9G MFDSSVUCEC YDAQOWEL TEAETIVRLR AXMNTPGLPV 1550 

OQEHLEFWEG VFIGLIHIDA HELSQIKQSG ENFPXLVA2fQ ATVCARAQAP 1600 

PESWD3«y«KC LIRLKPIIH3 PTPLLYRLGA \A3SDSVIUIHP riKYIMrCMS 1650 

ADLEWISIW VUJQQJLAMj AfiXCLSIOCV VIVGRIVLSG KPAIIHPEV 1700 

L\QEET3EMEE CSQHLPmQ OIMLAEQFKQ KATHTJCIIAS FHAEVTTPAV 1750 

QINWQKLEW VCaCHMANFIS GIQilAGLSr LPGNPAIASL MAFTAAVTSP 1800 

LTIOaillfN lUQQWyAAQL AAPGAATAFV GAGEAGAAIG SV3JGKVLVD 1850 

JLMJ^fMJJk GALVAFKIK5 GEVPSIHXV NLLPAII^PG ALW37VCAA 1900 
HjRRHVGPa: GAVQWMNE^ AEASRGNHVS PIHYVEESCA AARVEVH^S 1950 

LTVIQLLi^RL HSfCTSSECIT PC9GSSC«DI WDWICB/LSD FKZWUaKIM 2000 

PQLPGIPFVS OQRG mSVW R GDGIMHIPCH aSAETIGHVK N3IMRIVGPR 2050 

ICHSMrtSGIF PINamGEC TE=!LPAR«KF AIJWEWSAEEY VEIERyGDEH 2100 

YVSGMniNL KCPOQIESPE EPIELDGVFL HREftPPOOPL U?EEX/SFRVG 2150 

UJEXPyGSOL PCEEEEDyAV LTaiUIDPSH PEREAAGRRL ARGSPPSMAS 2200 

SSASQLSAES LKMCERNHD SEDAELTEaN lUWRQEMSaT HRWESENEO^ 2250 

VIII3SFI3ELV AEEDEREVSV PAEUPKSRR EARALPVWtfl EDaQPStA/ET 2300 

WKKECKEPPV VH3CELPPHI SPPVPPERKK RIWL1ES?IL SERLAEtAIK 2350 

SPGSSSrgSI ICXNITISSE PAPSC9CPEDS EWESYSSMPP LHSIGDEDL 2400 

SDG^ifCTVSS GADIECWa: SMSYSSflTDaL VrPC3\AEE]3K LPINALSNSL 2450 

li^fflNLVYSr TSRSAOapC^^: KVTFIRDaVL DSHiHOTi^E VKftAASKWKA 2500 

NL LSVEEaC S LTPEHSMCSK KSXGAm/RC HARKAVRHIN SVWKEIIEDS 2550 

VrPUJi'llMA KNEVPCVQEE KG3RKEAPLI VFHI£WRVC EKMALYEWS. 2600 

K LPLAVM3 SS OTQYSPQQR VEELVQAWKS KKTEMSRSVD IICEDSIVIE 2650 

SDIKIEEAIY QpCDUDPQAR VAIKSLTERL YVQ3PLTMSR GEbUSmCR 2700 

ASGVLTISOG NILTCYIKAR AACRAAGLCP CHMLVCGCDL WICESAGVQ 2750 

EDAASLRAFT EAMIPYSAPP GDPPQPEYDL ELTTSCSSNV SVAHDGAGKR 2800 

Vm-iraJPTT ELARAAWEIA RHTPWNSWLjG NIIMFAPIUW ARMIIMHEF 2850 

SVLIARDQLE QArWZETYGA OTSIEPLDLP PIIQRLH3LS AESLHSXSPG 2900 

EUSERWAACLR KLG7PELRAW RHIRARSVRAR LLSR03RAAI OGKXLENWAV 2950 

RIKLKLTPIA AAGRLDLSGW FEAGYSQC33I ^(HSVSHARPR WFWTCLLLIA 3000 

AG^^aCYLLEN R 3011 
GOcaQjoDcx: iGAia3333C GftCftciaac ooca^fljcac loxiCTG'mA. so 

G3AACEACIG ICTICAGQCA GAAAGCXSICT POOCKTGGOS ITOGIMGAG 100 

TCicjGiucaG cxnaaoGftC axnrrooc q3gagaqoca •msrosiciG iso 

OGGWmSSr GM3IPCMXJG GAMTOOCaG GACGACX3333 ' lULTi'lLTm 200 
GMCAACXX3G CICftAIQCCT GGAGMTTOS QCXSIQCXXXr QOGfiGftCIQC 250 
mSOCGMSm GIGnOSGflC GCX3AAAGGCX: TBSUGSIIACT OCXnGKEROG 300 
GIQCTIQ03A GIGOCXXXSQG AQSICTOSm GAOOGIQCAC amSAQCftOS 350 
AATOOAAAC CICAAAGAAA AADCftAAOSr ATOO^AOC QCnSOCSCRCA 400 
GGAOGICAAG TTOOOGOQaG GTOaiCAGAT GGflTOGflOSA GnTRCCIGT 450 
TOOOSOGCAG GSaOOCJCAQG TTOaSIGIOC GCX30C3ftCmG GAftOaCITOC 500 
GAQCQGID3C AAGCTOGIQG AM39CX3ACAA C3CiaTOC3CAA ASaC'lULjUOi 550 
AOGCX3ftG3gC AGQSOCiaSG CICAGOOC333 GIMJULT l Uj OOOCTCIMG 600 

GCAA3GAGQG CCTQI333KSG GCM3GAT39C TOCIGICftOC OCX30QQCrOC ^"650 
OQQCXnJftGIT QQQQOCDCAC GGftCmrOG C3G3303ia3C GIAACnQ3G 700 ^' 
TftAQGfTCKCC GMAOXTIA CATOOQQCTr OSCXXSATCIC ATOQQGflftCA 750 ■ 
ITOQQCTOGT OQQOQCGCGC CI2AG3QQQQG CTO0C3!^GQQC CTIGQCACRC 800 
GGIGIODQQG TICIGGA0C3A CX3Q0GIGAAC IMQCftACAG QGAflCITOOC 850 
aSGiTOCICT TTCIUmCT TCCTCITOQC ICIQCIG'ICC TCTTIGACrA 900 
ICOCAQCnC CQCnKIGAA GIQOQCAAaS T3IC30Q3G?a' AIRDCaTCIC 950 

AOSAACGACr OCPXAACIC AAQCftTIGIG 'HOCSMSSX CXSGAOSIGAT lOOO 

CATQCATftCT CXXDQQGIQCXS TOCXXTOIGT TCAGGAQQGT AACftQCIGab 1050 

Ci'IUL'IUQSr AQOCXnCftCr OOCAOTTCG 093CCAGGAA IQCXaOOGIC 1100 

CCTACEftCGA CAATftDGAOS (DCAOSIOSAC TIQCiasnG QGAaaOCTOC 1150 

Tncrocncc GciATCrcAos tqqqqsatct cnxacoGATCr atitigctog 1200 

■ICroOCAQCT GTICAC3CTIC TOQOCIOSCX: GQCftlGftGftC AGIQCftGSC 1250 

TOCAACIQCr CAAIdATOC CXBaXMGIA TCAQGflCACX: GCAK3C3CnG 1300 

QGATftlGATC ATGAACTOGr CAOCIftCAAC AQOOCTRGIG G' lGlCGLa CT 1350 

TOCIDOOGSVr OOCACftAQCT GTOSIGGACA T0SIQQa333 GSOOCACIGS 1400 

QGAGIOCTGG OaOQCXinQC CTRCIMTOC AIQSIAGaGA ACIQQQCiaA 1450 

QGITCIGATr GTOQOQCm: TCTnOOOGS CXnTGAOSaS GRGAOOCACA 1500 

OGMDOOaGftG QGIGaajQQC CACAGCaOCT CXX33GTICAC G ' iOXTmC 1550 

1Cft3CIG333 03IC1CAGAA AMC3CfiQCIT GIGAM3R0CA AOGGCftGC'lG 1600 

QCPCPOCMC AQGACroOQC TAAKTIQCAA T3ACIOX' lC GAAACTOOST 1650 

icrnoocxsc ociGirmc ocACACAAGfr tcaaciosic ogosigooog 170o 

GAG0QCA3Q3 OCftGCIQCXB CQQCAITCAC TOGaTTOQCDC AQ9aGIQQC9S 1750 

(XCEATCACC TftTftCIAAQC CTWOQCIC OCSATCftGAGS CmmOCT 1800 

GoaaraAoac ocxnasacDS i g' i gg ig ' i ug TROOCxxxaic GCAOsroiGr 1850 

QGTCCAGIGT ATTOmCftC CCCAMSOOCT (JV1UJJ33LQ3 QGADCAOOGA 1900 
TCOTOOTGT GTOOCXftCGT ^13)000333 GGAGAAIG?^ ACAGftOSIGA 1950 

TQCroCICAA CftACAOaCXSr (DOQOCftCAftG QCW03C3IT 03XTCiaCA 2000 

TOGftTCAATA GfEACIQ93IT CftCT^AGAOS TGOSSROSIC QXCGIGDA 2050 

CATa3QQ933 GiaSSDWDC QCAOCTIGAT CIG00CCAO3 GfiCIQCITOC 2100 

03AfiGC3CDC 0GAG3CEACr TftCftCAAAKr GIGXTOOOS G OX ' IOjrm 2150 

ACAOmOGT GOCTOGiaGA CiaOCDCKEftC A33CTn03C ASBmnC 2200 

C30CICAAT Ti'l'lULMLT TD^AOGflTftG GAIIG'IMGIU G33333Ji\n 2250 

AGCaC ftGXT CAATQOOXA T3C3iATIG(3V CIDGfiGSftGA QCX3C3GIMC 2300 

TI GGftGS ACA Q9SAam03IC AGAACICAGC GCX3CIQCIGC TCICBOAC 2350 

AGAGIQQCaG ATftCIQDOCr G'lUL'i'i'iCAC CAOOCmODS G UlTiMU J A 2400 

CTOGfrnGAT (XAlCiUJAT CAGAACftTCG IQSftOGTOCA mOCTCfERC 2450 

QaiGTA033r CAQCX3ITB3r ClUCl'i'iUCA. AICAAATOaS AG33RCA3CX?r 2500 

GnorrmC CnCICJCIOG CftGACXSOOGS C G ' iG ' lG ' iG OC TQCTIGIQSA 2550 

1GA3QCIGCT GAIRGCXXaG QCIGAQQOOG OCTiaGftGAA Ci'iUjiUU 'lL' 2600 

CICAATOOQG CXnJCCGTQOC a33AQ09GAT GGIMTCICT UJlTlUi'lUi ' 2650 

GITCTICIQC QOGGaCTOGr ACAT33^AQ93 CAG X 1GGCT a:TOGaGaa j 2700 

OG?EArocnT TrA3X3QOGnA TXjGCCGCIJGC TOCTQCroCr ACIQQCJSnA 2750 

cjCftCCftOGAG crEftooacrr ggacxx3qgag imsacrocAT 0310093333 28OO 

laoosnciT geaosictog tmttctigac crroicAOCA isotcaaag 2850 

TGrmCICAC TAGQCICMA TQUiUJi'JjAC AMACTTIAT CAC3CAGfiQC3C 2900 

GAQODOCftCA TOCAAGTGflG QSTOQCXXXr CICAADGTIC G333A03QCG 2950 

CX3ATOQCATC ATOCIDCICA anGIQQQGT 1CAI0CAGAG TEWOnTIG 3000 

ACA3CA0CAA ACiaCIQCTC GOCATftCTOG G3CX33CICAT GGIGCTO Cfl G 3050 

GCTCGOmh CGAGAGIGCC GI70TO3IG 03J3CrCMG GXTC M ltG 3100 

T3CA3QCATC TEAGflQOGAA AAGICGOJGS QGSICATiaT GTOGAAKDOG 3150 

TCTTCATCAA GCTO93CG03 CIGACAG3IA CXSDVDCSmA TAAOCMCrr 3200 

ACOrACIQC GQGACia33C CXS^OSDQQSC CIJOaGADC T iUGG m UUL' 3250 

GGraSAQCXr GiCG'lUi'lLT CXXSCXMOSA GAOCAAGSIC ATCAOCIGGG 3300 

GAQCAGACAC C3QCIG03IGr 0333ACATCA ICnOSSICT ACDCGICIDC 3350 

QCXXGAA03G GGAAQGAGAT Al'l'l'i'lU33A CXXaSCIGAIA GIC I 0 3 AAQG 3400 

GCAAQQGIGG OGAC'lUL'l'lG 03C30CATCAC 03003000 CAACAAAaSC 3450 

osoaoGEAcr TOGTIOCAIC ATCACEAQOC TCACAQSOOG QGACAAGAAC 3500 

CAGGTCGAAG QOSAQSTICA AG i GG'i'l'lL T AOOQCAACAC AA llLTi'lULT 3550 

GQCX3A0CIQC ATCAA0QQCX3 TGflQCTQGAC IGICTRDCAT Q3CX3CIG3Cr 3600 

CXSVAGAOXT AGOOQGia^i AAAGGIOSA TCAGOCAAAT 3650 

GTAGACCIOG AOCIDGIQQG CIGGCAGGOG 0000000303 OQOQCIOCAT 3700 

GACADCA10C AGCIUIOQCA GCTOOSACCT ITACTIOSIC AOSAGAGMG 3750 

CTGAT3ICAT TOOGGIQOQC CXaOOGAGOOG ACAGCAQ033 AAGIC I ftC'lC 3800 
TOOCX3CAQ3C: CDGflCTOdA 0OT3AAAQ3C TOCI0333IG GlOCM'jLUCT 3850 

•nooocnas GgacaoGiios t933cgicit cxjlux'igct GiGicom: 3900 

OJLJGmiaaC G?^A03333IG GfCTICmC ULUi'iUAGIC IMGGAAACT 3950 

AiXAomssr ctooogicit oogacaac TCftAaxccc aaacroraoc 4000 

QCfiGftCAnc cAftGiQ33c ATCiocsm: lazEwrrooc AGcma^fiGA 4050 

QCADCAAAGT QOCI33CIQ3G TMOCaODOC AftQ3SI3RCAA QSIOCfiaGIC 4100 

CIGAAOOCXSr CJOSnQOOQC CADCna03G Ti'iUJyUCJGT TflMGIDCAA 4150 

Q3CACAD3Gfr MOjAOOCnA ACATCftG?^ TO393IMQS AOSmSMXA. 4200 

aaoxox-ic cATiAOGrac TocAcnATC GCAftGnocr TODCxsftoasr 4250 

03CronCIG 0333333Cm T3ACATCMA AIMGIGKIG ASIQCCACIC 4300 

AACIGACTCG AC3303m:n' TQ3C9CAaa33 C3OGI0CTC GMI3W3033 4350 

AGAOSaCTOG AQ0G03QCIC GI0G?IQCID3 (XAOOaCIRC ACCIUJLQGA. 4400 

TO3SmOC3S IQCJCACAOCX: CS^amOaAG GAAM»03CC IGTOCftACAA. 4450 :> 

TOCaCAGMC LUL'i'lL ' JM G QCAAAQOGRT OJOJAii'lGAG QOCSOCMOa 4500- 

G3393iGQCA ' lUlLmTl 'l C TOOCATIDCA AGAftGAAMG TGROSaQCTC 4550 5 

GOC33CAAAQC IGACAQaOCT CQGACTGAAC QCIGfEftGCaT ATTACDQaQS 4600 ' 

OCTIGftlGIG 1D03ICM3C (TQOC'IMQGG AGftOGfTOGflT GID3IQ92AA 4550 

CAGftoacicr m^u&osgst ticadogoog Amrsftcic agigmosac 4700 

T3CAATACAT GIGICAOGCA GACAGflOGAC TICftOCTiaG ATOCXaGCIT 4750 

CftCCftTIGAG ADGACJGftOOG TOOOOCAAGA GQaOGflGTOG QjCiUQC RftC 4800 

GQaS«3C3IAG AACTO3CAG3 QC?IftQ3AGIG 02ATCEACAG GTTIGIGACT 4850 

C3CAQGAGAAC Q30CXna333 CAIGITOGAT TCTKOSKX: T3IGIGAGIG 4900 

CEATCAOSOG G3 L'lUlULTi ' GGfEKTSAQCT CA0300G3CT GAGACCICX3S 4950 

TTAQGnOOG OSCrmXHIk AATRCAC3CAG QGnOOGOGT CIGCTAGSftC 5000 

CATCIQGAGT ICTOGGAGAG OGICITCftCA OXCICACXX: ACATAGKIQC 5050 

exaC i ' l UL'l G TOCCAGACEA AACAG3CAQ3 AGACAACTIT 0CITACCB33 5100 

TOQCJaATCA J!«X!I30^GK3 TQ0Q0atf3^ CICAAQCTOC AaTTOCAaOS 5150 

T333A0C?\AA T3IOGAAGIG TCICATAOQS CIGAAACCIA CPCIGCMX3G 5200 

G0CAAGAOC3C ClUL'lUIM A GC3CI»GSAQC: CX3I0CAAAAT GAOGICAIIQC 5250 

TCACACAOX CAIAZ\CIAAA. TACATCAIQS CATOCA3GIC GGCIGAOCTC 5300?,. 

GAQGfTOGJICA ClftGCMnG GGIGCIGGIA OaOOGAGIOC TIQC»GCnT 5350:,; 

GQCXX3CA33AC T30CK3AOGA CAQQCAGIGT GSICATIGIG QC3GAQGA!ICA. 5400 

'ICriGlCCGG <3>ia30M3Zr UlUii'lUCOj ACAG3GAAGr CCiCJiACCAG 5450 

GAGrrGGATC AGATOGAAG^ GnPGrTOOCICA CAALTIOJIT AC AICGAGC A. 5500 

GQGAAIOCAG CKX3QCX3AQC AATICAAOCA AAAQaOQCTC QQGTIGITQC 5550 

AAA033CrAC CAAGCAAC30G GAGQCTOCIG CTOCJOGflQSr GGAGT OCAAG 5600 

TOQCX3AQ00C TIGAGAOCIT CIQQQCX3AAG CACAIGIQSA ATTICAICAG 5650 

CXXSAAIIACAG TAOCEAGCAG QCnMOCAC TCIGOCnSGA AAOCXXDOCXSA 5700 
TftGCftTCftTT caroocaTrr ACftGcncm ucpcimxxx: ocrcaccftoc 5750 

CAAAftCADCC TOCTGTmA 00010333 QGAIGSSIOS CIQCXXS^ACr 5800 

OGcnncxr AacxsciGosr CAocmoGr gssjsojjx: ArooocxasftG ssso 

ooxrcriGG CM3oma3c cnoasAAos iGcrcmosA caTCnaooG 5900 

Q3CEMEG393 CM3333SmC 093CX3CftCIC G?ro3CCnm AQ3ICRTCftG 5950 

O^OSftGSIG CXnoCAOCE AQGAOCIGSr CAACmCIC CXTOCX3VTOC 6000 

TCICIDCIUy TOOOCIOSIC GIOGQGGlCJ a 1GIGa3C3tf3C MmcnxXST 6050 

CXSSCaOGflGG QCmaaGaGA OOGOtaCIGIG CAGIQGMGA. A0 3 3XUUftr 6100 

AQOiTiuucr TOQcxsaasfm aomsicic cxxfiaarac TOiGiaaciG eiso 

AaoasAoa: TOCTtfKaa^T Gicft^^ 6200 

ACICAACTOC: TCAWmSCr OCAOCAGIQS MTAKIGRGS jOGL'iCiaC 6250 

^MXTCC Q3CI0GIQQC TAAOSSKIGr TIQ93ATIQS MMOCJAOSS 6300 

IGITSOGA CnCAA GAOC laOCICCfiGT OCAAACTOCT QOaaaasriA 6350 

CD393AGIDC CmUL'iUlC ATQ0CAACX3C QQGIITOAQG GAGICiaaOG 6400 

090330330 A3CATOCAAA CXSiOCIOaCC AIQCOGAGCA CAGATOQOOS 6450 

GACA3UICAA AAACX33nCC ATCAQGKDDG TAQ3QCXICAG AAOCTOC»QC 6500 

AACftCGTOQC AQQGAAOGfIT 00C3CATCAAC GCATACAOGA CmSACC Tl U 6550 

CACACOCTOC OCX330QCm^ ACIMTOCAG GQOaCIMQS 0Q3SiaaCIG 6600 

CIGAQ3AGIA CX3IQ3AOSIT AOSCGIGIOS OaGMTTOCA CTaOSIGACXS 6650 

GGCM^CJCA CIGACAAOSr AAAGIOOOCA TOOCfiGSITC CJGGOacmA 6700 

Ai'lu TiuftCX; GA0GTOGA3G GA&' I UCG G 'IT GCACAGSEAC QCIODQCXXSr 6750 

QCAAACOCT TCIADQC3GAG GAOGICADGT lOCAGGfTOGS QCICAAC3CAA 6800 

TACnOGTOG Q3ia3CAQCr OOCAaOOGAG Cm3AA003G ACGEAACAGfT 6850 

GCTEACno: ATQCICAOQG ATOOCTOOCA CATERCAQCA GAGAOSSCTA 6900 

AGOS IAGGCT GSCEAGAQQG TCICCOCOIT CnTAQCJCAG CTCAICaGCT 6950 

AQC3CAGnGr CIQOGOCTIG TnGAAQQOG ACAIQCACEA OOCAOCATCA 7000 

CICXXX33GAC QCIGAOCICA TOGAQGCiaA OCICITSIQG C3Q3a03fiGA 7050 

1 G33333AAA CATCACTOQC GIC3GAGICAG AGAATAAQST AGfmATICIG 7100 

GACICmaS AAGOacnCA CXXX3GAQ3Q3 GA3GAGAQQG AGAaMOOCT 7150 

OGQGOOQGAG ATCCIGOGAA AATOCAGGAA Ul'l UUL Ul L A (JUL ji'iUJLXA 7200 

TA.T33QCAQG 0C033AGIAC AATOCTOCAC TQCmSAGIC CIQSAAGGAC 7250 

OOQGACEAOG TCDCIDOGGr QGEACAOGGA TOCXXaTIGC CAOCTADCAA 7300 

QGCTOCTOCA ATAIXACXnC CA03GAGAAA GAQGAOSSIT GIOCIGACAG 7350 
AATOCAATCT GTCTICIQOC TIQQCX3GAQC TOQCTACEAA GA ULTIUU GT . 7400 

AQCTOCG3AT OGICQGOCGT TGATAG033C AOQ90GAOCX3 CCC'i'iUL'iUA 7450 

ocroaocioc gaogaoggig acaaaggatc asAasnoAG lanaciai'i ' 7500 

Ca^FGCCCCC 0CTIGAA03G GAG0CX3QG3G ADOOCXSAaCT CAG03ADQ33 7550 

ICTIQSICIA ODGIGAGIGA GGAOGCTAGT GAQGAIGTOG 'ICTOCiUL'lL ' 7600 
AAIGIDCmr AOSIQGftCftG GQQQOL'iUAT CAOaXMGC QCIQD33ftQ3 7650 

AAAGaaftGCr QOCXaiCAftC CVJbTlUAQCA AUlCTi'lULT QOSTCaCCaC 7700 

MCAKGICr AOXTftCAAC AIOCXXSCftQC GCRft GGC'XaC QQCftGAAGAA 7750 

GSICAICTIT GACftGATIQC AftGTOCIQSA IGMCKTEAC 0393A03EM: 7800 

ICAAGSftGftT GaAG333?VAG QOGJKJCACS^ TEAfiG3CmA QJi ' iUlAlLT 7850 

2m£3M30N33 CXHOCAAGCT GA03CXXrCA GftTTCCQXA AMCXaASIT 7900 

IQOCTOQSS GCAAfiOSftOS TOOCSSAACCr KTCCPGCMSG GXXSmftOC 7950 

AI2ATOCX3CTC 03IGIG93AG GAC'i'iUUlU5 AftOOOGA AftOtfXAATT 8000 

TCAIQQG?^ AAGTOftQGflT TICIQaSIOC AftCX^RGftGAA 8050 

G33M33JJ3C AAGOCAQCTC QGCmTOST ATIOOCRGftC ClUCGftGnC 8100 

GiGmrocGii (3VAGAroc9cx: crmosftOG iosiciixac ccnociCAG siso 

QCXDGIGAIOS GCICCiCMA OOSKTITCAA. TACICXX3CX3V AQCRQC3933r 8200 

CXSAGTIOCIG GIGAMftOCT OSAATCAAA GAAMQCXJCT MQQQCTICr 8250 

CftlMGACAC CCGCUUi'iTi' GACICAAOaS TCACIGAfaG TCACMTCXSfr 8300 

GITSftGoAGr CAATmOCA AIGTIGIGAC ITOaOOOOCG AQOOCftGACA 8350 

GOOSm^AGS TOQCICACAG AQOGQCTnA CPnD33333T CXXCIGACm 8400 

ACICAAAAGS QCAGAACIGC GGl'lMLGOC GSIGOOGCXX: AfiGIQQOSIG 8450 

CIGA03ACIA QCIGCX3GIAA TSmUCACA TGTEOTIGA AQgOCACIQC 8500 

AQOCIGTOGA QCIGCAAAOC TOCAQ3ACIG CA0GA3QCTC GIGAACX3GAG 8550 

AQGAOCnur OGTIMClGr GAAAQOQOOG GAA002AGGA QGAIGCOaaG 8600 

OCXXrCACGAG CmCAOSSA QQCEATCACT AGGIMTODG C30GOC3CX3033 8650 

GGKKTQOCX: CAAC3CAGAAT AOGADCIQGA. GCTGAIAACA TCAIGnOCT 8700 

OCAATOIGIC AGIOQCGCAC GAIGCATCIG GCAAAA03Sr ABCmOCIC 8750 

AQGOGIGAQC CCACrACOOC CCTIGCAOSS QCIQOGIGaS AGACAGCEAG 8800 

ACACACroCA ATCAAC3CIT QQCIAGQCAA lATCAICAro TMQCQ0C3CA 8850 

CXDC1MG99C AAQGATCATT CDSAIGACIC ACTITnCIC GAI OJi'lUlA 8900 

GCICAAGAGC AACTIGAAAA AGOQCIGGAT TSICAGATCT AaSGQQCTIG 8950 

CTACroCATT GAGDCACTIG AOCEAQCTCA GAICATIGAA C3GACIDCAIG 9000 

GIUmOOQC ATTEACACIC CACAGTEACT CroCAOSIGA GATCAATAGG 9050 

GIQQCnCAT G0CICA0C3AA ACnGQSGIA C3CACX3CnQC GAACCIGSAG 9100 

ACAaOSQQX AGAAGIGICC QCX3CTAAQCr ACIGTOOCAG QGQQQGAGaS 9150 

CCX3QCACnG TOCSCAGATAC CICITEAACr QOGCAGiaAG GAGCAAGCIT 9200 

AAACICACIC CAAIQGOQQC OaCGIOCXAG CIGGAL'i'lUr CIGG C 'lG Cj l T 9250 

OSroaCIQGT TACAQCXSQGS GAGACATATA 1CACAQ0CIG 'IC'lOGiCOX' 9300 

GACXXXX3CIG GmDOGfTIG TOOCIACia: TACmciGT M3333m332 9350 

ATmOCIGC TOGXTAAGOG ATGAAaQ03S AQCTAAOCAC TOGAOGOCIT 9400 

AAGQCATTIC ClGl'l'iTl'iT 'I'lTi'i'i'iTlT TITTnTnT TCITnTnT 9450 

•iTlUTlUCr 'i ' lCL ' l'IL'i'l ' r Tl'l ' lULTl ' lC U ' m ' lUUL ' iT CrrEAAIIQGr 9500 
QQCICCftTCT mSOCCmsr CM333Cm3Z TOIGftAflOSr CX33KS«30G 9550 
CAIGftCIQCA GftGfiGflQCIG ADSOOaXT CICTQCAGAT CMXST 9595 

FIG. 7F 
MSTMKP^RK TORNINRRPQ EWKFK333QI VQSVYLLRRR GHRDSWRftlR 50 

KASERSQERG RF?QPIEKARR EEGaRAWftQEG YIWELYGNEB ICWRGWtlSP 100 

RSSRESWGPT DEIIRRSPND3 KVlUll/iUUF ADUCKIELV GAEDQSW^A 150 

LAH3VFVLED G\^^2Q^IO€P OCSESIFIIA LLSCLUPAS A^EVRNVSSI 200 

YHVIMTSNS SIVYEAAD;/! MEJTPQ^^ 250 

ASVPirriRR HVDEJJVGIAA PCSfiM^VGDL OSSIFLVSQL FIFSEKREiEr 300 

MPOCSIYP C3HV9GHRMfiW rMM^flSPTT ALWSQLtHI PQKWEMMftG 350 

AHWGR/LAGLA VYSMVO^rtftK VLTmiFMS VDCEIHTIGR 400 

SLFSSGA9QK IQLVNIN3SW HINRiaUO^ DSLQnSEAA IFaHKENSS 450 

QCPERMA9CR PIDWE?\CaGWG PTIYIKENSS DCS^PKymA ERPOSWPAS 500 

QVaSEVKFT ESPVWGnD P93VPIYSWG ENEIIMCIN NIREEQGMWF 550 

QCIWMNSIGF IKroOGPPCN IGGVGMRIU CPlJJCb'KKHP EamKCGSS 600 

JWLTERCLVD VPXRLWHYPC TLNFSEFKWR M^VQSVEHRL NAACNWTOGE 650 

IOJr.EmERS ELSPLLLSrr EWanfCAPT TLPALSIGLI flLFISpm/DVQ 700 

YLVGVGSAFV SFAEKWEYIL LLfUJLADAR VCACIJ'«MIi lAQAEAALEN 750 

LWIZOAASVA GAHSILSFIiV FFCAAWmG PLAPGAM!aF V3VWPLLLLL 800 

lALPERftXAL EREMAASa9S AVLVGLVFLT LSFVYKVELT FLIV*flLjgYFI 850 

TRAEAH^QVW VPELNVRQOl EMIIIJICAV HPELIFDnK UJLMJJSPLM 900 

VLQAGTIRVP YFVRAQGJjrR ACMLVRKWftG OliTOWFMK DSMJIGIWy 950 

NHLTELiOC^ HAGUCtAVA. VEPWFSAME IKVTIWGRDr AAOGDIILGb 1000 

PVSARRGKEI FLGEftDSLEG QOWFUAPIT AirSQ3IPGWL QCinSLTGR 1050 

EKNQVEGE^ WSIMQSEL AICINSVCWT VYH3AGSKIL PaeSfGPTBSA UOO 

YINVnULVG K3APPQAR31 TPCS03SSDL YLVIRHAEWI PVRRR3DSRG 1150 

SLLSERPVSY LK3SSQGPLL CPSGHWGVF RAAVCIPGWA KAVDFTPVES 1200 

METIMRSFVF TCNSTPPAVP C?m3VAHIiiA PIG9GKSIKV PAAXAAQ3XK 1250 

VLVU>3PSVAA TUGFGAYMSK AHSIDENIPT G VKl ' i ' l ' lUG S nVSIYGKFL 1300 

ADSXSQGAY DIIICEOCHS TDSmUSIG TVLDaAEERG ARLWIAIRT 1350 

PPGSVTVFHP NTRRTQ-.SNN PIEAIK3GRH KLPCHSKKKC 1400 

DEIAAKLTGL GUqAVAY««3 3XVSVIPPIG DWWAIEftL MiUt'IUUbUS 1450 

VUXNICVIQ TVDFSUDPIF TiEmVPgD AVSRS^mi TGI«219GIYR 1500 

FVTK3ERP9G MEDSSVrm: "5fl»QCR^^ 1550 

VPTSUIHUA HFLSQIKQftG ENFPXLVftSQ MVCRRAC3M» 1600 

PPSWD3^WKG LIRLKPTLiC PTELLTFKA VQNEVILTHP HKYIMACMS 1650 

ADLEWrSIW VLVQGVLAAL AAYCUnGSV VrVGRIH^ KPAWPCREV 1700 

LYQEFIXMEE CASQLPmQ GM^LAEQFKQ KAIOIJ^IAT KQftEftAAPW 1750 

ESKWRALEIF V^^KHMANETiS GIQifLAGLSr LPaJPAIASL MAPEASTISP 1800 

LTTCNILLFN IDQGN^^^AC^ APPSAASAFV GAGIAGAAM3 SIGDSKVLVD 1850 

HAGVGAGVA GWIWAFKVMS GEVPSIEELV NLLPAILSPG ALWGWVCAA 1900 
m^WGPGE GaW3'WNRLI AFASRGNHVS PJmVPESDk AmriQUSS 1950 

LTriQtIi<RL H3WENEDCST KSGSWtlW WDWICIVLTD FKCWDQSKLL 2000 

ERLPGVEELS CJQRGilKGfVWR GDGIM3nCP OSRQIftGHWK M3SMR3V3ER 2050 

TCSNIVW3IF PINftmOtC TPSPAENXSR AIMFWAAEEY VEVIRVCaiH 2100 

WICMITCNV KCPOCVPAEE FFIEWDGWRL HRKAPACKPL IMEVn^ 2150 

INSiliVGgSL, fCfcPEtiVIV LTSMLTOSH rCREEWKRRL AK3SPPSLAS 2200 

SS ASQLSRB S IfO^ICnHHD SSOMJUmN UmSSS/OOH UmfESEIim 2250 

VELDSFEPIH AEGCEREISV AAEILRKSRK BPSALPBaR PCKNP PTJRQ 2300 

WKDEDSfVEEV VHTELPPIK AEPIEEERRK RIWDIESN7 SSMAmVEK 2350 

1H3S93SSAV DSG?I3MaLED LASCDGCKGS EWESXSSMEP LSSKSDEDL 2400 

SDGa«SIVSE EASETWCCS MSOTWIGALI TPC3\AEESKL P3NEL£NSLL 2450 

RHHtvWVimrT SRSASE«aKK VIFERriQVLD EHXREWLKEM KAKASIVKAK 2500 

U iSIEEAC KL TPJHSAKSKF GYGftKCWRNL SSRAVNHIES VWHU^DIE 2550 

TPIDITIMAK SEVPCVQEEK GGRKPARLIV FEDLCWRWCE KMMMWST 2600 

LPQAVMSSSy GPCf5fSEKQRV EELWHWKSK KCEMSFSyDT KSDSIVIES 2650 

nOWEESIYQ OCDLAEEaRQ AIESL1ERLY IGGKLflNSKG gCGliHRCRA 2700 

SGWLTISOGN ILTCYLKMA ACEy^AKLCPC TMLVNOXLV VICESAGigS 2750 

DAAALRAPIE AMTRYSAPKS DPFQEEMZE LTISCSSNVS VAHDftSGKRV 2800 

YYLIFDPITP tARAfiWEIftR HTPINSlAlLGN HM^ftPHMl RMIIMIHEFS 2850 

mAOEQLSK ALDOQIYGAC YSIEPLDLPQ IIEI^IHSE^ FHHSYSPGE 2900 

INRWASCLRK LG7PELKIWR HRARSVRAKL GRXLPN&CWR 2950 

TKLKLTPIPA ASQLCL9GWF \mG^333DT£ HSLSRARPFW FFUdlUSV 3000 

FIG. 7H 
SEQUENCE LISTING 



<110> Yanagi, Masayuki 
Emerson, Suzanne 
Bukh, Jens 
Purcell, Robert 

<120> Cloned Genome of Infectious Hepatitis C Viruses of 
Genotype 2a and Uses Thereof 



<130> 20264302PC 

<140> TBA 

<141> 2000-06-02 

<150> 60/137,693 
<151> 1999-06-04 

<160> 39 



<170> PatentIn Ver. 2.1 



<210> 1 
<211> 9711 
<212> DNA 

<213> Hepatitis C virus 
<400> 1 

acccgcccct aataggggcg acactccgcc atgaatcact cccctgtgag gaactactgt 60
cttcacgcag aaagcgtcta gccatggcgt tagtatgagt gtcgtacagc ctccaggccc 120
ccccctcccg ggagagccat agtggtctgc ggaaccggtg agtacaccgg aattgccggg 180
aagactgggt cctttcttgg ataaacccac tctatgcccg gccatttggg cgtgcccccg 240
caagactgct agccgagtag cgttgggttg cgaaaggcct tgtggtactg cctgataggg 300
tgcttgcgag tgccccggga ggtctcgtag accgtgcacc atgagcacaa atcctaaacc 360
tcaaagaaaa accaaaagaa acaccaaccg tcgcccacaa gacgttaagt ttccgggcgg 420
cggccagatc gttggoggag tatacttgtt gccgcgcagg ggccccaggt tgggtgtgcg 480
cgcgacaagg aagacttcgg agcggtccca gccacgtgga aggcgccagc ccatccctaa 540
agatcggcgc tccactggca aatcctgggg aaaaccagga tacccctggc ccctatacgg 600
gaatgaggga ctcggctggg caggatggct cctgtecccc cgaggttccc gtccctcttg 660
gggccccaat gacccccggc ataggtcgcg caacgtgggt aaggtcatcg ataccctaac 720
gtgcggcttt gccgacctca tggggtacat occtgtcgtg ggcgccccgc tcggcggcgt 780
cgccagagct ctcgcgcatg gcgtgagagt cctggaggac ggggttaatt ttgcaacagg 840
gaacttaccG ggttgctcct tttctatctt cttgctggcc ctgctgtcct gcatcaccac 900
cccggtctcc gctgccgaag tgaagaacat cagtaccggc tacatggtga ctaacgactg 960
caccaatgac agcattacct ggcagctcca ggctgctgtc ctccacgtCG ccgggtgcgt 1020
cccgtgcgag aaagtgggga atgcatctca gtgctggata ccggtctcac cgaatgtggc 1080
cgtgcagcgg cccggcgccc tcacgcaggg cttgcggacg cacatcgaca tggttgtgat 1140
gtccgccacg ctctgctctg ccctctacgt gggsgacctc tgcggtgggg tgatgctcgc 1200
agcccaaatg ttcattgtct cgccgcagca ccactggttt gtccaagact gcaattgctc 1260
catctaccct ggtaccatca ctggacaccg catggcatgg gacatgatga tgaactggtc 1320
gcccacggct accatgatct tggcgtacgc gatgcgtgcc cccgaggtca ttatagacat 1380
cattagcggg gctcattggg gcgtcatgtt cggcttggcc tacttctcta tgcagggagc 1440
gtgggcgaaa gtcgttgtca tccttctgtt ggccgccggg gtggacgcgc gcacccatac 1500
tgttgggggt tctgccgcgc agaccaccgg gcgcctcacc agcttatttg acatgggccc 1560
caggcagaaa atccagctcg ttaacaccaa tggcagctgg cacatcaacc gcaccgccct 1620
gaactgcaat gactccttgc acaccggctt tatcgcgtct ctgttctaca cccacagctt 1680
caacccgtca ggatgtcccg aacgcatgtc cgcctgccgc agtatcgagg ccttccgggt 1740
gggatggggc gccttgcaat atgaggataa tgtcaccaat ccagaggata tgagacccta 1800
ttgctggcac tacccaccaa ggcagtgtgg cgtggtctcc gcgaagactg tgtgtggccc 1860
agtgtactgt ttcaccccca gcccagtggt agtgggcacg accgacaggc ttggagcgcc 1920
cacttacacg tggggggaga atgagacaga tgtcttccta ttgaacagca ctcgaccacc 1980
gctggggtca tggttcggct gcacgtggat gaactcttct ggctacacca agacttgcgg 2040
cgcaccaccc tgccgtacta gagctgactt caacgccagc acggacctgt tgtgccccac 2100
ggactgtttt aggaagcatc ctgataccac ttacctcaaa tgcggctctg ggccctggct 2160
cacgccaagg tgcctgatcg actaccccta caggctctgg cattacccct gcacagttaa 2220
ctataccatc ttcaaaataa ggatgtatgt gggaggggtt gagcacaggc tcacggctgc 2280
atgcaatttc actcgtgggg atcgttgcaa cttggaggac agagacagaa gtcaactgtc 2340
tcctttgttg cactccacca cggaacgggc cattttacct tgctcttact cggacctgcc 2400
cgccttgtcg actggtcttc tccacctcca ccaaaacatc gtggacgtac aattcatgta 2460
tggcctatca cctgccctca caaaatacat cgtccgatgg gagtgggtaa tactcttatt 2520
cctgctctta gcggacgcca gggtttgcgc ctgcttatgg atgctcatct tgttgggcca 2580
ggccgaagca gcactagaga agctggtcat cttgcacgct gcgagcgcag ctagctgcaa 2640
tggcttccta tattttgtca tctttttcgt ggctgcttgg tacatcaagg gtcgggtagt 2700
ccccttagct acctattccc tcactggcct gtggtccttt agcctactgc toctagoatt 2760
gccccaacag gcttatgctt atgacgcatc tgtgcatggc cagataggag cggctctgct 2820
ggtaatgatc actctcttta ctctcacccc cgggtataag acccttctca gccggttttt 2880
gtggtggttg tgctatctte tgaccctggg ggaagctatg gtccaggagt gggcaccacc 2940
tatgcaggtg cgcggtggcc gtgatggcat catatgggcc gtcgccatat tctacccagg 3000
tgtggtgttt gacataacca agtggctctt ggcggtgctt gggcctgctt acctcctaaa 3060
aggtgctttg acgcgcgtgc cgtacttcgt cagggctcac gctctactga ggatgtgcac 3120
catggcaagg catctcgcgg ggggcaggta cgtccagatg gcgctactag cccttggcag 3180
gtggactggc acttacatct atgaccacct caccGctatg tcggattggg ctgctagtgg 3240
cctgcgggac ctggcggtcg ccgttgagcc tatcatcttc agtccgatgg agaagaaagt 3300
oattgtctgg ggagcggaga cagctgcttg tggggacatt ttacacggac ttcccgtgtc 3360
cgcccgactt ggtcgggagg tcctccttgg ccoagctgat ggctatacct ccaaggggtg 3420
gagtcttctc gcccccatca ctgottacgc ccagcagaca cgtggccttt tgggcaccat 3480
agtggtgagc atgacggggc gcgacaagac agaacaggct ggggaaattc aggtcctgtc 3540
cacagtcact cagtccttcc tcggaacato catctcgggg gttttgtgga ctgtctacca 3600
tggagctggc aacaagactc tggccggcto acggggtccg gtcacgcaga tgtactccag 3660
tgctgagggg gacttagtag ggtggcccag cccccctggg actaaatctt tggagccgtg 3720
cacgtgtgga gcggtcgacc tgtacctggt cacgcggaac gctgatgtca tcccggctcg 3780
aagacgcggg gacaaacggg gagcgctact ctccccgaga cctctttcca ccttgaaggg 3840
gtcctcagga ggcccggtgc tatgccccag gggccacgct gtcggagtct tccgggcagc 3900
tgtgtgctct cggggcgtgg ctaagtccat agatttcatc cccgttgaga cactcgacat 3960
cgtcacgcgg tcccccacct ttagtgacaa cagcacacca cctgctgtgc cccagaccta 4020
tcaggtcggg tacttgcatg ccccgactgg cagcggaaag agcaccaaag ttcctgtcgc 4080
atatgctgct caggggtaCa aagtgctagt gcttaatccG tcagtggctg ccaccctggg 4140
gtttggggcg tacttgtcta aggcacatgg catcaatccc aacattagga ctggagtcag 4200
gactgtgacg accggggcgc ccatcacgta ctccacatat ggcaaattcc tcgccgatgg 4260
gggctgtgcg ggcggcgcct acgacatcat catatgtgat gaatgccatg ccgtggactc 4320
taccaccatc cttggcatcg gaacagtcct tgatcaagca gagacagctg gggtcagact 4380
aactgtgctg gctacagcta cgccccctgg gtcagtgaca accccccacc ccaacataga 4440
ggaggtggcc cttgggcagg agggcgagat ccccttctat gggagggcga ttcccctgtc 4500
ttacatcaag ggaggaagac atctgatctt ctgccattca aagaaaaagt gtgacgagct 4560
cgcggcggcc cttcggggta tgggcttgaa ctcagtggca tactacagag ggttggacgt 4620
ctccgtaata ccaactcagg gagacgtagC ggtcgtcgcc accgacgccc tcatgacagg 4680
gtatactggg gactttgact ccgtgatcga ctgcaacgta gcggtcactc aagttgtaga 4740
cttcagttta gaccccacat tcaccataac cacacagatt gtccctcaag acgctgtctc 4800
acgtagccag cgccggggtc gcacgggtag gggaagactg ggcatttata ggtatgtttc 4860
cactggtgag cgagcctcag gaatgtttga cagtgtagtg ctctgtgagt gctacgacgc 4920
aggggccgca tggtatgagc tcacaccatc ggagaccacc gtcaggctca gggcgtattt 4980
caacacgccc ggtttgcctg tgtgccaaga ccatcttgag ttttgggagg cagttttcac 5040
cggcctcaca cacatagatg cccacttcct ttcccaaaca aagcaatcgg gggaaaattt 5100
cgcatactta acagcctacc aggctacagt gtgcgctagg gccaaagccc cccccccgtc 5160
ctgggacgtc atgtggaagt gtttgactcg actcaagccc acactcgtgg gccccacacc 5220
tctcctgtac ogcttgggct ctgttaccaa cgaggccacc ctcacacatc ccgtgacgaa 5280
atacatcgcc acctgcatgc aagccgacct tgaggtcatg accagcacat gggtcttggC 5340
agggggagtc ttggcggccg tcgccgcgta ttgcctggcg accgggtgtg tttgcatcat 5400
cggccgcttg cacattaacc agcgagccgt cgttgcgccg gacaaggagg tcctctatga 5460
ggcttttgat gagatggagg aatgtgcctc tagggcggct ctcattgaag aggggcagcg 5520
gatagccgag atgctgaagt ccaagatcca aggcttattg cagcaagctt ccaaacaagc 5580
tcaagacata caacccactg tgcaggcttc atggcccaag gtagaacaat tctgggccaa 5640
acacatgtgg aacttcatta gcggcatcca atacctcgca ggactatcaa cactgccagg 5700
gaaccctgca gtagcttcca tgatggcgtt cagtgccgcc ctcaccagtc cgctgtcaac 5760
aagcaccact atccttctca acattttggg gggctggcta goatcccaaa ttgcaccacc 5820
cgcgggggcc actggcttcg ttgtcagtgg cctagtggga gctgccgtag gcagtatagg 5880
cttaggtaag gtgctagtgg acatcctggc agggtatggt gcgggcattt cgggggctct 5940
cgtcgcattc aagatcatgt ctggcgagaa gccctccatg gaggatgtcg tcaacttgct 6000
gcotggaatt ctgtctccgg gtgccttggt agtgggagtc atctgcgcgg ccattctgcg 6060
ccgacacgtg ggaccggggg aaggcgccgt ccaatggatg aatagactca ttgcctttgc 6120
ttccagagga aatcacgtcg cccccaccca ctacgtgacg gagtcggatg cgtcgcagcg 6180
tgtgacccaa ctacttggct cccttaccat aaccagcctg ctcagaagac tccacaactg 6240
gattactgag gactgcccca tcccatgcgg cggctcgtgg ctccgcgatg tgtgggactg 6300
ggtttgcacc atcctaacag actttaaaaa ttggctgacc tccaaattat tcccaaagat 6360
gcccggcctc ccctttgtct cctgtcaaaa ggggtacaag ggcgtgtggg ccggcactgg 6420
catcatgacc acacggtgtc cttgcggcgc caatatctct ggcaatgtcc gcttgggctc 6480
catgagaatc acggggccta agacctgcat gaatatctgg caggggacct ttcctatcaa 6540
ttgttacacg gagggccagt gcgtgccgaa acccgcgcca aactttaagg tcgccatctg 6600
gagggtggcg gcctcagagt acgcggaggt gacgcagcac gggtcatacc actacataac 6660
aggactcacc actgataact tgaaagtccc ctgccaacta ccGtctcccg agttcttttc 6720
ctgggtggac ggagtgcaga tccataggtt tgccGccaca ccgaagccgt ttttccggga 6780
tgaggtctcg ttctgcgttg ggcttaattc atttgtcgtc gggtcccagc ttccttgcga 6840
ccctgaaccc gacacagacg tattgatgtc catgctaaca gatccatctc atatcacggc 6900
ggagactgca gcgcggcgtt tagcgcgggg gtcaccccca tccgaggcaa gctcctcggc 6960
gagccagcta tcggcaccat cgctgcgagc cacctgcacc acccacggca aagcctatga 7020
tgtggacatg gtggatgcta acctgttcat ggggggcgat gtgactcgga tagagtctgg 7080 
gtccaaagtg gtcgttctgg actctctcga cccaatggtc gaagaaagga gcgaccttga 7140 
gccttcgata ccatcagaat acatgctccc caagaagagg ttcccaccag ctttaccggc 7200 
ctgggcacgg cctgattaca acccaccgct tgtggaatcg tggaaaaggc cagattacca 7260 
accggccact gttgcgggct gtgctctccc tcctcctagg aaaaccccga cgcctccccc 7320 
aaggaggcgc cggacagtgg gcctaagtga ggactccata ggagatgccc ttcaacagct 7380 
ggccattaag tcctttggcc agcccccccc aagcggcgat tcaggccttt ccacgggggc 7440 
gggcgctgcc gattccggca gtcagacgcc tcctgatgag ttggcccttt cggagacagg 7500 
ttccatctct tccatgcccc ccctcgaggg ggagcttgga gatccagacc tggagcctga 7560
gcaggtagag ccccaacccc ccccccaggg gggggtggca gctcccggct cggactcggg 7620 
gtcctggtct acttgctccg aggaggacga ctccgtcgtg tgctgctcca tgtcatactc 7680
ctggaccggg gctctaataa ctccttgtag tcccgaagag gagaagttac cgattaaccc 7740 
cttgagcaac tccctgttgc gatatcacaa caaggtgtac tgtaccacaa caaagagcgc 7800
ctcactaagg gctaaaaagg taacttttga taggatgcaa gtgctcgact cctactacga 7860 
ctcagtctta aaggacatta agctagcggc ctccaaggtc accgcaaggc tcctcaccat 7920 
ggaggaggct tgccagttaa ccccacccca ttctgcaaga tctaaatatg ggtttggggc 7980
taaggaggtc cgcagcttgt ccgggagggc cgttaaccac atcaagtccg tgtggaagga 8040
cctcctggag gactcagaaa caccaattcc cacaaccatt atggccaaaa atgaggtgtt 8100 
ctgcgtggac cccaccaagg ggggcaagaa agcagctcgc cttatcgttt accctgacct 8160 
cggcgtcagg gtctgcgaga agatggccct ttatgacatt acacaaaaac ttcctcaggc 8220
ggtgatgggg gcttcttatg gattccagta ttcccccgct cagcgggtag agtttctctt 8280
gaaagcatgg gcggaaaaga aggaccctat gggtttttcg tatgataccc gatgctttga 8340 
ctcaaccgtc actgagagag acatcaggac tgaggagtcc atatatcggg cctgctcctt 8400 
gcccgaggag gcccacactg ccatacactc gctaactgag agactttacg tgggagggcc 8460 
tatgttcaac agcaagggcc aaacctgcgg gtacaggcgt tgccgcgcca gcggggtgct 8520 
caccactagc atggggaaca ccatcacatg ctacgtgaaa gccttagcgg cttgtaaagc 8580 
tgcagggata atcgcgccca caatgctggt atgcggcgat gacttggttg tcatctcaga 8640 
aagccagggg accgaggagg acgagcggaa cctgagagcc ttcacggagg ctatgaccag 8700 
gtattctgcc cctcctggtg acccccccag accggagtat gatctggagc tgataacatc 8760 
ttgctcctca aatgtgtctg tggcgctggg cccacaaggc cgccgcagat actacctgac 8820
cagagaccct accactccaa tcgcccgggc tgcctgggaa acagttagac actcccctgt 8880 
caattcatgg ctgggaaaca tcatccagta cgccccgacc atatgggctc gcatggtcct 8940
gatgacacac ttcttctcca ttctcatggc tcaagacacg ctggaccaga acctcaactt 9000 
tgagatgtac ggagcggtgt actccgtgag tcccttggac ctcccagcta taattgaaag 9060 
gttacatggg cttgacgctt tttctctgca cacatacact ccccacgaac tgacacgggt 9120 
ggcttcagcc ctcagaaaac ttggggcgcc acccctcaga gcgtggaaga gccgggcacg 9180 
tgcagtcagg gcgtccctca tctcccgtgg ggggagagcg gccgtfetgcg gtcgatatct 9240 
cttcaattgg gcggtgaaga ccaagctcaa actcactcca ttgccggaag cgcgcctcct 9300
ggatttatcc agctggttca ccgtcggcgc cggcgggggc gacatttatc acagcgtgtc 9360 
gcgtgcccga ccccgcttat tgctctttgg cctactccta ctttttgtag gggtaggcct 9420
tttcctactc cccgctcggt agagcggcac acattagcta cactccatag ctaactgtcc 9480 
cttttttttt tttttttttt tttttttttt tttttttttt tttttttttt tttttttttt 9540 
tttttttttt tttttttttt tttttctttt tttctctttt ccttctttct taccttattt 9600 
tactttcttt cctggtggct ccatcttagc cctagtcacg gctagctgtg aaaggtccgt 9660 
gagccgcatg actgcagaga gtgccgtaac tggtctctct gcagatcatg t 9711 
<210> 2
<211> 3033
<212> PRT 

<213> Hepatitis C virus 
<400> 2 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220
Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480
Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr
500 505 510

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr
515 520 525

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr
530 535 540

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser
545 550 555 560

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp
565 570 575

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys
580 585 590

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys
645 650 655

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735
Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Lys Leu Val Ile Leu His Ala Ala Ser Ala Ala Ser Cys Asn Gly
755 760 765

Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp Tyr Ile Lys Gly
770 775 780

Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr Gly Leu Trp Ser Phe
785 790 795 800

Ser Leu Leu Leu Leu Ala Leu Pro Gln Gln Ala Tyr Ala Tyr Asp Ala
805 810 815

Ser Val His Gly Gln Ile Gly Ala Ala Leu Leu Val Met Ile Thr Leu
820 825 830

Phe Thr Leu Thr Pro Gly Tyr Lys Thr Leu Leu Ser Arg Phe Leu Trp
835 840 845

Trp Leu Cys Tyr Leu Leu Thr Leu Gly Glu Ala Met Val Gln Glu Trp
850 855 860

Ala Pro Pro Met Gln Val Arg Gly Gly Arg Asp Gly Ile Ile Trp Ala
865 870 875 880

Val Ala Ile Phe Tyr Pro Gly Val Val Phe Asp Ile Thr Lys Trp Leu
885 890 895

Leu Ala Val Leu Gly Pro Ala Tyr Leu Leu Lys Gly Ala Leu Thr Arg
900 905 910

Val Pro Tyr Phe Val Arg Ala His Ala Leu Leu Arg Met Cys Thr Met
915 920 925

Ala Arg His Leu Ala Gly Gly Arg Tyr Val Gln Met Ala Leu Leu Ala
930 935 940

Leu Gly Arg Trp Thr Gly Thr Tyr Ile Tyr Asp His Leu Thr Pro Met
945 950 955 960

Ser Asp Trp Ala Ala Ser Gly Leu Arg Asp Leu Ala Val Ala Val Glu
965 970 975

Pro Ile Ile Phe Ser Pro Met Glu Lys Lys Val Ile Val Trp Gly Ala
980 985 990
Glu Thr Ala Ala Cys Gly Asp Ile Leu His Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Leu Gly Arg Glu Val Leu Leu Gly Pro Ala Asp Gly Tyr Thr Ser
1010 1015 1020

Lys Gly Trp Ser Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040

Arg Gly Leu Leu Gly Thr Ile Val Val Ser Met Thr Gly Arg Asp Lys
1045 1050 1055

Thr Glu Gln Ala Gly Glu Ile Gln Val Leu Ser Thr Val Thr Gln Ser
1060 1065 1070

Phe Leu Gly Thr Ser Ile Ser Gly Val Leu Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Asn Lys Thr Leu Ala Gly Ser Arg Gly Pro Val Thr Gln Met
1090 1095 1100

Tyr Ser Ser Ala Glu Gly Asp Leu Val Gly Trp Pro Ser Pro Pro Gly
1105 1110 1115 1120

Thr Lys Ser Leu Glu Pro Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu
1125 1130 1135

Val Thr Arg Asn Ala Asp Val Ile Pro Ala Arg Arg Arg Gly Asp Lys
1140 1145 1150

Arg Gly Ala Leu Leu Ser Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Val Leu Cys Pro Arg Gly His Ala Val Gly Val Phe
1170 1175 1180

Arg Ala Ala Val Cys Ser Arg Gly Val Ala Lys Ser Ile Asp Phe Ile
1185 1190 1195 1200

Pro Val Glu Thr Leu Asp Ile Val Thr Arg Ser Pro Thr Phe Ser Asp
1205 1210 1215

Asn Ser Thr Pro Pro Ala Val Pro Gln Thr Tyr Gln Val Gly Tyr Leu
1220 1225 1230

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Val Ala Tyr
1235 1240 1245
Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260

Thr Leu Gly Phe Gly Ala Tyr Leu Ser Lys Ala His Gly Ile Asn Pro
1265 1270 1275 1280

Asn Ile Arg Thr Gly Val Arg Thr Val Thr Thr Gly Ala Pro Ile Thr
1285 1290 1295

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ala Gly Gly
1300 1305 1310

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ala Val Asp Ser Thr
1315 1320 1325

Thr Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340

Val Arg Leu Thr Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
1345 1350 1355 1360

Thr Pro His Pro Asn Ile Glu Glu Val Ala Leu Gly Gln Glu Gly Glu
1365 1370 1375

Ile Pro Phe Tyr Gly Arg Ala Ile Pro Leu Ser Tyr Ile Lys Gly Gly
1380 1385 1390

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Ala Leu Arg Gly Met Gly Leu Asn Ser Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Gln Gly Asp Val Val Val Val Ala
1425 1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455

Asp Cys Asn Val Ala Val Thr Gln Val Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Thr Thr Gln Ile Val Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Ser Gln Arg Arg Gly Arg Thr Gly Arg Gly Arg Leu Gly Ile Tyr Arg
1490 1495 1500
Tyr Val Ser Thr Gly Glu Arg Ala Ser Gly Met Phe Asp Ser Val Val
1505 1510 1515 1520

Leu Cys Glu Cys Tyr Asp Ala Gly Ala Ala Trp Tyr Glu Leu Thr Pro
1525 1530 1535

Ser Glu Thr Thr Val Arg Leu Arg Ala Tyr Phe Asn Thr Pro Gly Leu
1540 1545 1550

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Ala Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Ala Tyr Leu Thr Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Lys Ala Pro Pro Pro Ser Trp Asp Val Met Trp Lys Cys Leu Thr
1605 1610 1615

Arg Leu Lys Pro Thr Leu Val Gly Pro Thr Pro Leu Leu Tyr Arg Leu
1620 1625 1630

Gly Ser Val Thr Asn Glu Val Thr Leu Thr His Pro Val Thr Lys Tyr
1635 1640 1645

Ile Ala Thr Cys Met Gln Ala Asp Leu Glu Val Met Thr Ser Thr Trp
1650 1655 1660

Val Leu Ala Gly Gly Val Leu Ala Ala Val Ala Ala Tyr Cys Leu Ala
1665 1670 1675 1680

Thr Gly Cys Val Cys Ile Ile Gly Arg Leu His Ile Asn Gln Arg Ala
1685 1690 1695

Val Val Ala Pro Asp Lys Glu Val Leu Tyr Glu Ala Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ala Ser Arg Ala Ala Leu Ile Glu Glu Gly Gln Arg Ile
1715 1720 1725

Ala Glu Met Leu Lys Ser Lys Ile Gln Gly Leu Leu Gln Gln Ala Ser
1730 1735 1740

Lys Gln Ala Gln Asp Ile Gln Pro Thr Val Gln Ala Ser Trp Pro Lys
1745 1750 1755 1760
Val Glu Gln Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Val Ala
1780 1785 1790

Ser Met Met Ala Phe Ser Ala Ala Leu Thr Ser Pro Leu Ser Thr Ser
1795 1800 1805

Thr Thr Ile Leu Leu Asn Ile Leu Gly Gly Trp Leu Ala Ser Gln Ile
1810 1815 1820

Ala Pro Pro Ala Gly Ala Thr Gly Phe Val Val Ser Gly Leu Val Gly
1825 1830 1835 1840

Ala Ala Val Gly Ser Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu
1845 1850 1855

Ala Gly Tyr Gly Ala Gly Ile Ser Gly Ala Leu Val Ala Phe Lys Ile
1860 1865 1870

Met Ser Gly Glu Lys Pro Ser Met Glu Asp Val Val Asn Leu Leu Pro
1875 1880 1885

Gly Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Ile Cys Ala Ala
1890 1895 1900

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ala Pro Thr
1925 1930 1935

His Tyr Val Thr Glu Ser Asp Ala Ser Gln Arg Val Thr Gln Leu Leu
1940 1945 1950

Gly Ser Leu Thr Ile Thr Ser Leu Leu Arg Arg Leu His Asn Trp Ile
1955 1960 1965

Thr Glu Asp Cys Pro Ile Pro Cys Gly Gly Ser Trp Leu Arg Asp Val
1970 1975 1980

Trp Asp Trp Val Cys Thr Ile Leu Thr Asp Phe Lys Asn Trp Leu Thr
1985 1990 1995 2000

Ser Lys Leu Phe Pro Lys Met Pro Gly Leu Pro Phe Val Ser Cys Gln
2005 2010 2015
Lys Gly Tyr Lys Gly Val Trp Ala Gly Thr Gly Ile Met Thr Thr Arg
2020 2025 2030

Cys Pro Cys Gly Ala Asn Ile Ser Gly Asn Val Arg Leu Gly Ser Met
2035 2040 2045

Arg Ile Thr Gly Pro Lys Thr Cys Met Asn Ile Trp Gln Gly Thr Phe
2050 2055 2060

Pro Ile Asn Cys Tyr Thr Glu Gly Gln Cys Val Pro Lys Pro Ala Pro
2065 2070 2075 2080

Asn Phe Lys Val Ala Ile Trp Arg Val Ala Ala Ser Glu Tyr Ala Glu
2085 2090 2095

Val Thr Gln His Gly Ser Tyr His Tyr Ile Thr Gly Leu Thr Thr Asp
2100 2105 2110

Asn Leu Lys Val Pro Cys Gln Leu Pro Ser Pro Glu Phe Phe Ser Trp
2115 2120 2125

Val Asp Gly Val Gln Ile His Arg Phe Ala Pro Thr Pro Lys Pro Phe
2130 2135 2140

Phe Arg Asp Glu Val Ser Phe Cys Val Gly Leu Asn Ser Phe Val Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Asp Pro Glu Pro Asp Thr Asp Val Leu Met
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Thr Ala Ala Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Glu Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Arg Ala Thr Cys Thr Thr His Gly Lys
2210 2215 2220

Ala Tyr Asp Val Asp Met Val Asp Ala Asn Leu Phe Met Gly Gly Asp
2225 2230 2235 2240

Val Thr Arg Ile Glu Ser Gly Ser Lys Val Val Val Leu Asp Ser Leu
2245 2250 2255

Asp Pro Met Val Glu Glu Arg Ser Asp Leu Glu Pro Ser Ile Pro Ser
2260 2265 2270
Glu Tyr Met Leu Pro Lys Lys Arg Phe Pro Pro Ala Leu Pro Ala Trp
2275 2280 2285

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Ser Trp Lys Arg Pro
2290 2295 2300

Asp Tyr Gln Pro Ala Thr Val Ala Gly Cys Ala Leu Pro Pro Pro Arg
2305 2310 2315 2320

Lys Thr Pro Thr Pro Pro Pro Arg Arg Arg Arg Thr Val Gly Leu Ser
2325 2330 2335

Glu Asp Ser Ile Gly Asp Ala Leu Gln Gln Leu Ala Ile Lys Ser Phe
2340 2345 2350

Gly Gln Pro Pro Pro Ser Gly Asp Ser Gly Leu Ser Thr Gly Ala Gly
2355 2360 2365

Ala Ala Asp Ser Gly Ser Gln Thr Pro Pro Asp Glu Leu Ala Leu Ser
2370 2375 2380

Glu Thr Gly Ser Ile Ser Ser Met Pro Pro Leu Glu Gly Glu Leu Gly
2385 2390 2395 2400

Asp Pro Asp Leu Glu Pro Glu Gln Val Glu Pro Gln Pro Pro Pro Gln
2405 2410 2415

Gly Gly Val Ala Ala Pro Gly Ser Asp Ser Gly Ser Trp Ser Thr Cys
2420 2425 2430

Ser Glu Glu Asp Asp Ser Val Val Cys Cys Ser Met Ser Tyr Ser Trp
2435 2440 2445

Thr Gly Ala Leu Ile Thr Pro Cys Ser Pro Glu Glu Glu Lys Leu Pro
2450 2455 2460

Ile Asn Pro Leu Ser Asn Ser Leu Leu Arg Tyr His Asn Lys Val Tyr
2465 2470 2475 2480

Cys Thr Thr Thr Lys Ser Ala Ser Leu Arg Ala Lys Lys Val Thr Phe
2485 2490 2495

Asp Arg Met Gln Val Leu Asp Ser Tyr Tyr Asp Ser Val Leu Lys Asp
2500 2505 2510

Ile Lys Leu Ala Ala Ser Lys Val Thr Ala Arg Leu Leu Thr Met Glu
2515 2520 2525
Glu Ala Cys Gln Leu Thr Pro Pro His Ser Ala Arg Ser Lys Tyr Gly
2530 2535 2540

Phe Gly Ala Lys Glu Val Arg Ser Leu Ser Gly Arg Ala Val Asn His
2545 2550 2555 2560

Ile Lys Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Glu Thr Pro Ile
2565 2570 2575

Pro Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Asp Pro Thr
2580 2585 2590

Lys Gly Gly Lys Lys Ala Ala Arg Leu Ile Val Tyr Pro Asp Leu Gly
2595 2600 2605

Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Ile Thr Gln Lys Leu
2610 2615 2620

Pro Gln Ala Val Met Gly Ala Ser Tyr Gly Phe Gln Tyr Ser Pro Ala
2625 2630 2635 2640

Gln Arg Val Glu Phe Leu Leu Lys Ala Trp Ala Glu Lys Lys Asp Pro
2645 2650 2655

Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu
2660 2665 2670

Arg Asp Ile Arg Thr Glu Glu Ser Ile Tyr Arg Ala Cys Ser Leu Pro
2675 2680 2685

Glu Glu Ala His Thr Ala Ile His Ser Leu Thr Glu Arg Leu Tyr Val
2690 2695 2700

Gly Gly Pro Met Phe Asn Ser Lys Gly Gln Thr Cys Gly Tyr Arg Arg
2705 2710 2715 2720

Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Met Gly Asn Thr Ile Thr
2725 2730 2735

Cys Tyr Val Lys Ala Leu Ala Ala Cys Lys Ala Ala Gly Ile Ile Ala
2740 2745 2750

Pro Thr Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Ser Glu Ser
2755 2760 2765

Gln Gly Thr Glu Glu Asp Glu Arg Asn Leu Arg Ala Phe Thr Glu Ala
2770 2775 2780
Met Thr Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Arg Pro Glu Tyr
2785 2790 2795 2800

Asp Leu Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala Leu
2805 2810 2815

Gly Pro Gln Gly Arg Arg Arg Tyr Tyr Leu Thr Arg Asp Pro Thr Thr
2820 2825 2830

Pro Ile Ala Arg Ala Ala Trp Glu Thr Val Arg His Ser Pro Val Asn
2835 2840 2845

Ser Trp Leu Gly Asn Ile Ile Gln Tyr Ala Pro Thr Ile Trp Ala Arg
2850 2855 2860

Met Val Leu Met Thr His Phe Phe Ser Ile Leu Met Ala Gln Asp Thr
2865 2870 2875 2880

Leu Asp Gln Asn Leu Asn Phe Glu Met Tyr Gly Ala Val Tyr Ser Val
2885 2890 2895

Ser Pro Leu Asp Leu Pro Ala Ile Ile Glu Arg Leu His Gly Leu Asp
2900 2905 2910

Ala Phe Ser Leu His Thr Tyr Thr Pro His Glu Leu Thr Arg Val Ala
2915 2920 2925

Ser Ala Leu Arg Lys Leu Gly Ala Pro Pro Leu Arg Ala Trp Lys Ser
2930 2935 2940

Arg Ala Arg Ala Val Arg Ala Ser Leu Ile Ser Arg Gly Gly Arg Ala
2945 2950 2955 2960

Ala Val Cys Gly Arg Tyr Leu Phe Asn Trp Ala Val Lys Thr Lys Leu
2965 2970 2975

Lys Leu Thr Pro Leu Pro Glu Ala Arg Leu Leu Asp Leu Ser Ser Trp
2980 2985 2990

Phe Thr Val Gly Ala Gly Gly Gly Asp Ile Tyr His Ser Val Ser Arg
2995 3000 3005

Ala Arg Pro Arg Leu Leu Leu Phe Gly Leu Leu Leu Leu Phe Val Gly
3010 3015 3020

Val Gly Leu Phe Leu Leu Pro Ala Arg
3025 3030
<210> 3

<211> 9611

<212> DNA

<213> Hepatitis C virus 



<400> 3 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360 

ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420 

gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480 

gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540 

aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600 

ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660

ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720 

cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780

tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840

ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900

ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960 

gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020 

tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080 

ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140

tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200 

cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260 

ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320 

cgcccacggc taccatgatc ttggcgtacg cgatgcgtgc ccccgaggtc attatagaca 1380

tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440 

cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500 

ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560 

ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620 

tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680 

tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740 

tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800 

attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860 

cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920 

ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980

cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040

gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100 

cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160 

tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220

actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280 

catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340 

ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400 

ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460 
atggcctatc acctgccctc
tcctgctctt agcggacgcc 
aggccgaagc agcactagag 
atggcttcct atattttgtc 
tccccttagc tacctattcc 
tgccccaaca ggcatatgca 
ttgtcgggtt aatggcgctg 
tgtggtggct tcagtatttt 
ccctcaacgt ccgggggggg 
ccctggtatt tgacatcacc 
aagccagttt gcttaaagtc 
cgctagcgcg gaagatagcc 
cgcttactgg cacctatgtg 
gcctgcgaga tctggccgtg 
tcatcacgtg gggggcagat 
ctgcccgtag gggccaggag 
ggaggttgct ggcgcccatc 
taatcaccag cctgactggc 
caactgctac ccaaaccttc 
acggggccgg aacgaggacc 
atgtggacca agaccttgtg 
gtacctgcgg ctcctcggac 
gccggcgagg tgatagcagg 
gctcctcggg gggtccgctg 
cggtgtgcac ccgtggagtg 
caaccatgag atccccggtg 
tccaggtggc ccacctgcat 
cgtacgcagc ccagggctac 
gctttggtgc ttacatgtcc 
gaacaattac cactggcagc 
gcgggtgctc aggaggtgct 
ccacatccat cttgggcatc 
tggttgtgct cgccactgct 
aggaggttgc tctgtccacc 
aggtgatcaa ggggggaaga 
tcgccgcgaa gctggtcgca 
tgtctgtcat cccgaccagc 
gctttaccgg cgacttcgac 
atttcagcct tgaccctacc 
ccaggactca acgccggggc 
caccggggga gcgcccctcc 
cgggctgtgc ttggtatgag 
tgaacacccc ggggcttccc 
cgggcctcac tcatatagat 
ttccttacct ggtagcgtac 
cgtgggacca gatgtggaag 
ccctgctata cagactgggc 
aatacatcat gacatgcatg 



acaaaataca tcgtccgatg 
agggtttgcg cctgcttatg 
aagctggtca tcttgcacgc 
atctttttcg tggctgcttg 
ctcactggcc tgtggtcctt 
ctggacacgg aggtggccgc 
actctgtcgc catattacaa 
ctgaccagag tagaagcgca 
cgcgatgccg tcatcttact 
aaactactcc tggccatctt 
ccctacttcg tgcgcgttca 
ggaggtcatt acgtgcaaat 
tataaccatc tcacccctct 
gctgtggaac cagtcgtctt 
accgccgcgt gcggtgacat 
atactgcttg ggccagccga 
acggcgtacg cccagcagac 
cgggacaaaa accaagtgga 
ctggcaacgt gcatcaatgg 
atcgcatcac ccaagggtcc 
ggctggcccg ctcctcaagg
ctttacctgg tcacgaggca 
ggtagcctgc tttcgccccg 
ttgtgccccg cgggacacgc 
gctaaagcgg tggactttat 
ttcacggaca actcctctcc 
gctcccaccg gcagcggtaa 
aaggtgttgg tgctcaaccc 
aaggcccatg gggttgatcc 
cccatcacgt actccaccta 
tatgacataa taatttgtga 
ggcactgtcc ttgaccaagc 
acccctccgg gctccgtcac 
accggagaga tcccctttta 
catctcatct tctgccactc 
ttgggcatca atgccgtggc 
ggcgatgttg tcgtcgtgtc 
tctgtgatag actgcaacac
tttaccattg agacaaccac 
aggactggca gggggaagcc 
ggcatgttcg actcgtccgt 
ctcacgcccg ccgagactac 
gtgtgccagg accatcttga 
gcccactttt tatcccagac 
caagccaccg tgtgcgctag 
tgtttgatcc gccttaaacc 
gctgttcaga atgaagtcac 
tcggccgacc tggaggtcgt 
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460
aggagctcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580 
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640 
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700 
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880 
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940 
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000 
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240 
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300 
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540 
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720 
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780 
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840 
agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900 
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960 
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080 
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140 
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200 
gggccctgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260 
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380 
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440 
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500 
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560 
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680 
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800 
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860 
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980 
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040 
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160 
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220 
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgcat gatacccgct 8280
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340
gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940
ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420
catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600
gcagatcatg t 9611



<210> 4 
<211> 3015 
<212> PRT 

<213> Hepatitis C virus 
<400> 4 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp 






WO 00/75338 PCT/US00/15446
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 

100 105 110 

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His

21 



BNSDOOID; <W0 0075333A2J_> 



wo 00/75338 PCTAJSOO/15446 
340 345 350 

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp 
355 360 365 

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg 
370 375 380 

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr 
385 390 395 400 

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr 
405 410 415 

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser 
420 425 430 

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn 
435 440 445 

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala 
450 455 460 

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn 
465 470 475 480 

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys 
485 490 495 

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr 
500 505 510 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr 
515 520 525 

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr 
530 535 540 

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser 
545 550 555 560 

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp 
565 570 575 

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys 
580 585 590 

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr 
595 600 605 

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys 
610 615 620 

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val 
625 630 635 640 

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys 
645 650 655 

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser 
660 665 670 

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala 
675 680 685 

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln 
690 695 700 

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp 
705 710 715 720 

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys 
725 730 735 

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu 
740 745 750 

Glu Lys Leu Val Ile Leu His Ala Ala Ser Ala Ala Ser Cys Asn Gly 
755 760 765 

Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp Tyr Ile Lys Gly 
770 775 780 

Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr Gly Leu Trp Ser Phe 
785 790 795 800 

Ser Leu Leu Leu Leu Ala Leu Pro Gln Gln Ala Tyr Ala Leu Asp Thr 
805 810 815 

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala 
820 825 830 

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp 
835 840 845 

Trp Leu Gln Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp 
850 855 860 

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu 
865 870 875 880 

Met Cys Val Val His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu 
885 890 895 

Leu Ala Ile Phe Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys 
900 905 910 

Val Pro Tyr Phe Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu 
915 920 925 

Ala Arg Lys Ile Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys 
930 935 940 

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu 
945 950 955 960 

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu 
965 970 975 

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala 
980 985 990 

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala 
995 1000 1005 

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser 
1010 1015 1020 

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr 
1025 1030 1035 1040 

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys 
1045 1050 1055 

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr 
1060 1065 1070 

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly 
1075 1080 1085 

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met 
1090 1095 1100 

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly 
1105 1110 1115 1120 

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu 
1125 1130 1135 

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser 
1140 1145 1150 

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser 
1155 1160 1165 

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe 
1170 1175 1180 

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile 
1185 1190 1195 1200 

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp 
1205 1210 1215 

Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu 
1220 1225 1230 

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr 
1235 1240 1245 

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala 
1250 1255 1260 

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro 
1265 1270 1275 1280 

Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr 
1285 1290 1295 

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly 
1300 1305 1310 

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr 
1315 1320 1325 

Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly 
1330 1335 1340 

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr 
1345 1350 1355 1360 

Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu 
1365 1370 1375 

Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly 
1380 1385 1390 

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala 
1395 1400 1405 

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly 
1410 1415 1420 

Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser 
1425 1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile 
1445 1450 1455 

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro 
1460 1465 1470 

Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg 
1475 1480 1485 

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg 
1490 1495 1500 

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val 
1505 1510 1515 1520 

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro 
1525 1530 1535 

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu 
1540 1545 1550 

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly 
1555 1560 1565 

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly 
1570 1575 1580 

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg 
1585 1590 1595 1600 

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile 
1605 1610 1615 

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu 
1620 1625 1630 

Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr 
1635 1640 1645 

Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp 
1650 1655 1660 

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser 
1665 1670 1675 1680 

Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro 
1685 1690 1695 

Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met 
1700 1705 1710 

Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu 
1715 1720 1725 

Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser 
1730 1735 1740 

Arg His Ala Glu Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys 
1745 1750 1755 1760 

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile 
1765 1770 1775 

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala 
1780 1785 1790 

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly 
1795 1800 1805 

Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu 
1810 1815 1820 

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly 
1825 1830 1835 1840 

Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu 
1845 1850 1855 

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile 
1860 1865 1870 

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro 
1875 1880 1885 

Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala 
1890 1895 1900 

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met 
1905 1910 1915 1920 

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr 
1925 1930 1935 

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu 
1940 1945 1950 

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile 
1955 1960 1965 

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile 
1970 1975 1980 

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys 
1985 1990 1995 2000 

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln 
2005 2010 2015 

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg 
2020 2025 2030 

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met 
2035 2040 2045 

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe 
2050 2055 2060 

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro 
2065 2070 2075 2080 

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu 
2085 2090 2095 

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp 
2100 2105 2110 

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu 
2115 2120 2125 

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu 
2130 2135 2140 

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val 
2145 2150 2155 2160 

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr 
2165 2170 2175 

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg 
2180 2185 2190 

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser 
2195 2200 2205 

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp 
2210 2215 2220 

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu 
2225 2230 2235 2240 

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile 
2245 2250 2255 

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val 
2260 2265 2270 

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala 
2275 2280 2285 

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr 
2290 2295 2300 

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu 
2305 2310 2315 2320 

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr 
2325 2330 2335 

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala 
2340 2345 2350 

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn 
2355 2360 2365 

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser 
2370 2375 2380 

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2385 2390 2395 2400 

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala 
2405 2410 2415 

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly 
2420 2425 2430 

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn 
2435 2440 2445 

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr 
2450 2455 2460 

Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg 
2465 2470 2475 2480 

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys 
2485 2490 2495 

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 
2500 2505 2510 

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly 
2515 2520 2525 

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn 
2530 2535 2540 

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr 
2545 2550 2555 2560 

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly 
2565 2570 2575 

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg 
2580 2585 2590 

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu 
2595 2600 2605 

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg 
2610 2615 2620 

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly 
2625 2630 2635 2640 

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp 
2645 2650 2655 

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln 
2660 2665 2670 

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly 
2675 2680 2685 

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg 
2690 2695 2700 

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr 
2705 2710 2715 2720 

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr 
2725 2730 2735 

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly 
2740 2745 2750 

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr 
2755 2760 2765 

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu 
2770 2775 2780 

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly 
2785 2790 2795 2800 

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu 
2805 2810 2815 

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp 
2820 2825 2830 

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile 
2835 2840 2845 

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu 
2850 2855 2860 

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro 
2865 2870 2875 2880 

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe 
2885 2890 2895 

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys 
2900 2905 2910 



Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala 
2915 2920 2925 

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile 
2930 2935 2940 

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu 
2945 2950 2955 2960 

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr 
2965 2970 2975 

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg 
2980 2985 2990 

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly 
2995 3000 3005 

Ile Tyr Leu Leu Pro Asn Arg 
3010 3015 



<210> 5 
<211> 9611 
<212> DNA 

<213> Hepatitis C virus 
<400> 5 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccgg 180 
gaagactggg tcctttcttg gataaaccca ctctatgccc ggccatttgg gcgtgccccc 240 
gcaagactgc tagccgagta gcgttgggtt gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360 
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420 
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480 
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540 
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600 
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660 
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720 
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780 
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840 
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900 
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960 
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020 
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080 
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140 
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200 
cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260 
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320 
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380 
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440 
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500 
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560 
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620 
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680 
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740 
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800 
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860 
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920 
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980 
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040 
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100 
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160 
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220 
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280 
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340 
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400 
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460 
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520 
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580 
aggccgaagc agctttggag aacctcgtaa tactcaatgc agcatccctg gccgggacgc 2640 
acggtcttgt gtccttcctc gtgttcttct gctttgcgtg gtatctgaag ggtaggtggg 2700 
tgcccggagc ggtctacgcc ctctacggga tgtggcctct cctcctgctc ctgctggcgt 2760 
tgcctcagcg ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820 
ttgtcgggtt aatggcgctg actctgtcgc cacattacaa gcgctatatc agctggtgca 2880 
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940 
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000 
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060 
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120 
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180 
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240 
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300 
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360 
ctgcccgtag gggccaggag atactgcttg ggccagccga cggaatggtc tccaaggggt 3420 
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480 
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540 
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600 
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660 
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720 
gtacctgcgg ctcctcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780 
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840 
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900 
cggtgtgcac ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga 3960 
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020 
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080 
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140 
gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200 
gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260 
gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320 
ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380 
tggttgtgct cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440 
aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg 4500 
aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560 
tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620 
tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680 
gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740 
atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800 
ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860 
caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920 
cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980 
tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040 
cgggcctcac tcatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100 
ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160 
cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220 
ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280 
aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340 
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400 
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460 
aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520 
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580 
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640 
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700 
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760 
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820 
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880 
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940 
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000 
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060 
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120 
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180 
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240 
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300 
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360 
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420 
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480 
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540 
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600 
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660 
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttctca 6720 
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780 
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840 
agcccgaacc ggacgtagcc gtgttgacgt ccatgcccac tgatccctcc catataacag 6900 
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ccccacggcc agctcctcgg 6960 
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020 
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080 
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140 
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200 
gggccctgcc cgtgtgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260 
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320 
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380 
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440 
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500 
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560 
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620 
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680 
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740 
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800 
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860 
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920 
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980 
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040 
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100 
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160 
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220 
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280 
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340 
gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400 
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460 
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520 
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgap ttagtcgtta 8580 
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640 
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700 
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760 
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820 
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880 
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940 
ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000 
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060 
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120 
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180 
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240 
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300 
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360 
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420 
catttcctgt tttttttttt tttttttttt tttttttctt tctttttttc tttcctttcc 9480 
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540 
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600 
gcagatcatg t 9611 
<210> 6 

<211> 3015 
<212> PRT 

<213> Hepatitis C virus 



<400> 6 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn 
1 5 10 15 

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly 
20 25 30 

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala 
35 40 45 

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro 
50 55 60 

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly 
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp 
85 90 95 

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro 
100 105 110 

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys 
115 120 125 

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu 
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp 
145 150 155 160 

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile 
165 170 175 

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala 
180 185 190 

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr 
195 200 205 

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro 
210 215 220 

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile 
225 230 235 240 

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln 
245 250 255 

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys 
260 265 270 

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala 
275 280 285 

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys 
290 295 300 

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp 
305 310 315 320 

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr 
325 330 335 

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His 
340 345 350 

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp 
355 360 365 

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg 
370 375 380 

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr 
385 390 395 400 

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr 
405 410 415 

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser 
420 425 430 

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn 
435 440 445 

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala 
450 455 460 

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn 
465 470 475 480 

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys 
485 490 495 

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr 
500 505 510 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr 
515 520 525 

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr 
530 535 540 

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser 
545 550 555 560 

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp 
565 570 575 

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys 
580 585 590 

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr 
595 600 605 

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys 
610 615 620 

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val 
625 630 635 640 

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys 
645 650 655 

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser 
660 665 670 

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala 
675 680 685 

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln 
690 695 700 

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp 
705 710 715 720 

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys 
725 730 735 

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu 
740 745 750 

Glu Asn Leu Val He Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly 
755 760 765 

Leu Val Ser Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly 
770 775 780 

Arg Trp Val Pro Gly Ala Val Tyr Ala Leu Tyr Gly Met Trp Pro Leu 
783 790 795 800 

Leu Leu Leu Leu Leu Ala Leu Pro Gin Arg Ala Tyr Ala Leu Asp Thr 

805 810 815 

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala 
820 825 830 

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr He Ser Trp Cys Met Trp 
835 840 845 

Trp Leu Gin Tyr Phe Leu Thr Arg Val Glu Ala Gin Leu His Val Trp 
850 855 . 860 

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val He Leu Leu 
865 870 875 880 

Met Cys Val Val His Pro Thr Leu Val Phe Asp He Thr Lys Leu Leu 
885 890 895 

Leu Ala He Phe Gly Pro Leu Trp He Leu Gin Ala Ser Leu Leu Lys 
900 905 910 

Val Pro Tyr Phe Val Arg Val Gin Gly Leu Leu Arg He Cys Ala Leu 
915 920 925 

Ala Arg Lys He Ala Gly Gly His Tyr Val Gin Met Ala He He Lys 
930 935 940 

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu 
945 950 955 960 

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu 
965 970 975 

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala
980 985 990

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser
1010 1015 1020

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040 

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys
1045 1050 1055

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr
1060 1065 1070

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met
1090 1095 1100

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
1105 1110 1115 1120 

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu 
1125 1130 1135 

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
1140 1145 1150

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe 
1170 1175 1180 

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200 

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp 
1205 1210 1215 

Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu
1220 1225 1230

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr
1235 1240 1245

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260 

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro 
1265 1270 1275 1280 

Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr
1285 1290 1295 

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly 
1300 1305 1310 

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr
1315 1320 1325

Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340 

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr 
1345 1350 1355 1360 

Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu
1365 1370 1375

Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly
1380 1385 1390

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser
1425 1430 1435 1440 

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg
1490 1495 1500

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val
1505 1510 1515 1520

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro 
1525 1530 1535 

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu 
1540 1545 1550 

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile
1605 1610 1615 

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu 
1620 1625 1630 

Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr
1635 1640 1645

Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp
1650 1655 1660 

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser 
1665 1670 1675 1680 

Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro
1685 1690 1695

Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu
1715 1720 1725

Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser
1730 1735 1740

Arg His Ala Glu Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys
1745 1750 1755 1760

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala
1780 1785 1790 

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly 
1795 1800 1805 

Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu
1810 1815 1820

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly
1825 1830 1835 1840

Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu
1845 1850 1855

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile
1860 1865 1870 

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro 
1875 1880 1885 

Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala
1890 1895 1900

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
1925 1930 1935

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
1940 1945 1950

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
1955 1960 1965 

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
1970 1975 1980

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln
2005 2010 2015

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met
2035 2040 2045 

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe
2050 2055 2060

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro
2065 2070 2075 2080 

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu 
2085 2090 2095 

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp
2100 2105 2110

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu
2115 2120 2125 

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu 
2130 2135 2140 

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val 
2145 2150 2155 2160 

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg
2180 2185 2190 

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser 
2195 2200 2205 

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp
2210 2215 2220

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu
2225 2230 2235 2240

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile
2245 2250 2255

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val
2260 2265 2270

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285 

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr 
2290 2295 2300 

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu 
2305 2310 2315 2320 

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr 
2325 2330 2335 

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala 
2340 2345 2350 

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn
2355 2360 2365 

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser 
2370 2375 2380 

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2385 2390 2395 2400

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala 
2405 2410 2415 

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly 
2420 2425 2430 

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn
2435 2440 2445 

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr 
2450 2455 2460 

Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg
2465 2470 2475 2480

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
2485 2490 2495 

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 
2500 2505 2510 

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly 
2515 2520 2525

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr
2545 2550 2555 2560

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly
2565 2570 2575

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg
2580 2585 2590 

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu 
2595 2600 2605 

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly
2625 2630 2635 2640

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp
2645 2650 2655

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln
2660 2665 2670

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly
2675 2680 2685 

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg 
2690 2695 2700 

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr 
2705 2710 2715 2720 

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr
2725 2730 2735

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly
2740 2745 2750

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr
2755 2760 2765

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu
2770 2775 2780

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800 

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu 
2805 2810 2815 

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp 
2820 2825 2830 

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile
2835 2840 2845

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu
2850 2855 2860

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro
2865 2870 2875 2880

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe
2885 2890 2895 

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys
2900 2905 2910

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala
2915 2920 2925

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile
2930 2935 2940

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu
2945 2950 2955 2960

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr
2965 2970 2975

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg
2980 2985 2990 

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly 
2995 3000 3005 

Ile Tyr Leu Leu Pro Asn Arg
3010 3015



<210> 7
<211> 9611
<212> DNA

<213> Hepatitis C virus

<400> 7

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtggg gtgatgctcg 1200
cagcccaaat gttcattgtc tcgccgcagc accactggtt tgtccaagac tgcaattgct 1260
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580
aggccgaagc agcactagag aagctggtca tcttgcacgc tgcgagcgca gctagctgca 2640
atggcttcct atattttgtc atctttttcg tggctgcttg gtacatcaag ggtcgggtag 2700 
tccccttagc tacctattcc ctcactggcc tgtggtcctt tagcctactg ctcctagcat 2760 
tgccccaaca ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820 
ttgtcgggtt aatggcgctg actctgtcgc catattacaa gcgctatatc agctggtgca 2880 
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940 
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000 
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120 
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360
ctgcccgtag gggccaggag atactgcttg ggccagccga cggaatggtc tccaaggggt 3420 
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480 
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540 
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600 
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660 
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720 
gtacctgcgg ctcttcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900
cggtgtgcac ccgtggagtg gataaagcgg tggaatttat ccctgtggag aacctaggga 3960
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080 
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140 
gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200
gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260 
gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320
ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380
tggttgtgct cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440 
aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg 4500 
aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560 
tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620 
tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680 
gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740 
atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800 
ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860
caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920
cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980
tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040
cgggcctcac tcatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100
ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160
cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220 
ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280 
aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340 
ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400
tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460
aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520
tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580 
cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640
agcacatgtg gaatttcatc agtgggatac aatacttggc gggcctgtca acgctgcctg 5700 
gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760 
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820 
ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880
gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940 
ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000 
tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060 
gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120 
cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180 
gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240 
ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300 
ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360 
tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420 
gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480 
cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540 
acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600 
ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660 
cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720
cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780 
aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840 
agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900 
cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960 
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020 
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080 
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140 
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200
gggccctgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440 
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560 
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620 
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800 
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920 
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980 
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040 
aggtcttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100 
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160 
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280 
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340
gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460 
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520 
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640 
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700 
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760 
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820 
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880 
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940 
ttaactgtga gatctacgga gcctgctact ccatagaacc actggatcta cctccaatca 9000 
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060 
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180 
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240 
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300 
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360 
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420 
catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480 
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540 
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600
gcagatcatg t 9611



<210> 8 
<211> 3015 
<212> PRT 

<213> Hepatitis C virus 
<400> 8 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80 

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp 
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140 

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190 

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270 

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335 

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380 

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430 

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495 

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr 
500 505 510 

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr 
515 520 525

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr 
530 535 540 

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser 
545 550 555 560 

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp 
565 570 575

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys 
580 585 590 

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr 
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys
645 650 655

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720 

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Lys Leu Val Ile Leu His Ala Ala Ser Ala Ala Ser Cys Asn Gly
755 760 765

Phe Leu Tyr Phe Val Ile Phe Phe Val Ala Ala Trp Tyr Ile Lys Gly
770 775 780 

Arg Val Val Pro Leu Ala Thr Tyr Ser Leu Thr Gly Leu Trp Ser Phe 
785 790 795 800 

Ser Leu Leu Leu Leu Ala Leu Pro Gln Gln Ala Tyr Ala Leu Asp Thr
805 810 815

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala
820 825 830

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp
835 840 845

Trp Leu Gln Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp
850 855 860

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu
865 870 875 880 

Met Cys Val Val His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu
885 890 895

Leu Ala Ile Phe Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys
900 905 910

Val Pro Tyr Phe Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu
915 920 925

Ala Arg Lys Ile Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys
930 935 940 

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu 
945 950 955 960 

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu 
965 970 975 

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala
980 985 990

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser
1010 1015 1020

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040 

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys
1045 1050 1055

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr
1060 1065 1070

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly
1075 1080 1085 

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met
1090 1095 1100

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
1105 1110 1115 1120

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu
1125 1130 1135

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
1140 1145 1150

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe
1170 1175 1180

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp 
1205 1210 1215 

Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu
1220 1225 1230 

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr 
1235 1240 1245 

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260 

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro 
1265 1270 1275 1280 

Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr
1285 1290 1295 

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly 
1300 1305 1310 

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr
1315 1320 1325

Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
1345 1350 1355 1360

Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu
1365 1370 1375

Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly
1380 1385 1390 

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser
1425 1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455 

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg
1490 1495 1500 

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val 
1505 1510 1515 1520

Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro
1525 1530 1535

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu
1540 1545 1550

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile
1605 1610 1615

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu
1620 1625 1630

Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr
1635 1640 1645

Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp
1650 1655 1660 

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser 
1665 1670 1675 1680 

Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro
1685 1690 1695

Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu
1715 1720 1725 

Ala Glu Gin Phe Lys Gin Lys Ala Leu Gly Leu Leu Gin Thr Ala Ser 
1730 1735 1740 



Arg His Ala Glu Val He Thr Pro Ala Val Gin Thr Asn Trp Gin Lys 
1745 1750 1755 1760 

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe He Ser Gly He 
1765 1770 1775 

Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala He Ala 
1780 1785 1790 

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly 
1795 1800 1805 

Gin Thr Leu Leu Phe Asn He Leu Gly Gly Trp Val Ala Ala Gin Leu 
1810 1815 1820 

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly 
1825 1830 1835 1840 

Ala Ala He Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp He Leu 
1845 1850 1855 

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys He 
I860 1865 1870 

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro 
1875 1880 1885 
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Ala He Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala 
1890 1895 1900 

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
1925 1930 1935

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
1940 1945 1950

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
1955 1960 1965

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
1970 1975 1980

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln
2005 2010 2015

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met
2035 2040 2045

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe
2050 2055 2060

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro
2065 2070 2075 2080

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu
2085 2090 2095

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp
2100 2105 2110

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu
2115 2120 2125

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu
2130 2135 2140

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp
2210 2215 2220

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu
2225 2230 2235 2240

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile
2245 2250 2255

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val
2260 2265 2270

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr
2290 2295 2300

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu
2305 2310 2315 2320

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr
2325 2330 2335

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala
2340 2345 2350

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn
2355 2360 2365

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser
2370 2375 2380

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly
2385 2390 2395 2400

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala
2405 2410 2415

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly
2420 2425 2430

Ala Leu Val Thr Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn
2435 2440 2445

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr
2450 2455 2460

Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg
2465 2470 2475 2480

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
2485 2490 2495

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala
2500 2505 2510

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly
2515 2520 2525

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr
2545 2550 2555 2560

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly
2565 2570 2575

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg
2580 2585 2590

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu
2595 2600 2605

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly
2625 2630 2635 2640

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp
2645 2650 2655

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln
2660 2665 2670

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly
2675 2680 2685

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg
2690 2695 2700

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr
2705 2710 2715 2720

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr
2725 2730 2735

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly
2740 2745 2750

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr
2755 2760 2765

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu
2770 2775 2780

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800

Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu
2805 2810 2815

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp
2820 2825 2830

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile
2835 2840 2845

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu
2850 2855 2860

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro
2865 2870 2875 2880

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe
2885 2890 2895

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys
2900 2905 2910

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala
2915 2920 2925

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile
2930 2935 2940

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu
2945 2950 2955 2960

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr
2965 2970 2975

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg
2980 2985 2990

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly
2995 3000 3005

Ile Tyr Leu Leu Pro Asn Arg
3010 3015



<210> 9 
<211> 9611 
<212> DNA 

<213> Hepatitis C virus 
<400> 9 

gccagccccc tgatgggggc gacactccac catgaatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gataaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcaagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcaca aatcctaaac 360
ctcaaagaaa aaccaaaaga aacaccaacc gtcgcccaca agacgttaag tttccgggcg 420
gcggccagat cgttggcgga gtatacttgt tgccgcgcag gggccccagg ttgggtgtgc 480
gcgcgacaag gaagacttcg gagcggtccc agccacgtgg aaggcgccag cccatcccta 540
aagatcggcg ctccactggc aaatcctggg gaaaaccagg atacccctgg cccctatacg 600
ggaatgaggg actcggctgg gcaggatggc tcctgtcccc ccgaggttcc cgtccctctt 660
ggggccccaa tgacccccgg cataggtcgc gcaacgtggg taaggtcatc gataccctaa 720
cgtgcggctt tgccgacctc atggggtaca tccctgtcgt gggcgccccg ctcggcggcg 780
tcgccagagc tctcgcgcat ggcgtgagag tcctggagga cggggttaat tttgcaacag 840
ggaacttacc cggttgctcc ttttctatct tcttgctggc cctgctgtcc tgcatcacca 900
ccccggtctc cgctgccgaa gtgaagaaca tcagtaccgg ctacatggtg actaacgact 960
gcaccaatga cagcattacc tggcagctcc aggctgctgt cctccacgtc cccgggtgcg 1020
tcccgtgcga gaaagtgggg aatgcatctc agtgctggat accggtctca ccgaatgtgg 1080
ccgtgcagcg gcccggcgcc ctcacgcagg gcttgcggac gcacatcgac atggttgtga 1140
tgtccgccac gctctgctct gccctctacg tgggggacct ctgcggtgg^ gtgatgctcg 1200
cagcccaaat gttcattgtc tcgccgcagc accaccggtt tgtccaagac tgcaattgct 1260
ccatctaccc tggtaccatc actggacacc gcatggcatg ggacatgatg atgaactggt 1320 
cgcccacggc taccatgatc ttggcgtacg cgatgcgtgt ccccgaggtc attatagaca 1380 
tcattagcgg ggctcattgg ggcgtcatgt tcggcttggc ctacttctct atgcagggag 1440 
cgtgggcgaa agtcgttgtc atccttctgt tggccgccgg ggtggacgcg cgcacccata 1500 
ctgttggggg ttctgccgcg cagaccaccg ggcgcctcac cagcttattt gacatgggcc 1560 
ccaggcagaa aatccagctc gttaacacca atggcagctg gcacatcaac cgcaccgccc 1620
tgaactgcaa tgactccttg cacaccggct ttatcgcgtc tctgttctac acccacagct 1680 
tcaactcgtc aggatgtccc gaacgcatgt ccgcctgccg cagtatcgag gccttccggg 1740 
tgggatgggg cgccttgcaa tatgaggata atgtcaccaa tccagaggat atgagaccct 1800 
attgctggca ctacccacca aggcagtgtg gcgtggtctc cgcgaagact gtgtgtggcc 1860
cagtgtactg tttcaccccc agcccagtgg tagtgggcac gaccgacagg cttggagcgc 1920 
ccacttacac gtggggggag aatgagacag atgtcttcct attgaacagc actcgaccac 1980 
cgctggggtc atggttcggc tgcacgtgga tgaactcttc tggctacacc aagacttgcg 2040 
gcgcaccacc ctgccgtact agagctgact tcaacgccag cacggacctg ttgtgcccca 2100 
cggactgttt taggaagcat cctgatacca cttacctcaa atgcggctct gggccctggc 2160 
tcacgccaag gtgcctgatc gactacccct acaggctctg gcattacccc tgcacagtta 2220 
actataccat cttcaaaata aggatgtatg tgggaggggt tgagcacagg ctcacggctg 2280 
catgcaattt cactcgtggg gatcgttgca acttggagga cagagacaga agtcaactgt 2340 
ctcctttgtt gcactccacc acggaatggg ccattttacc ttgctcttac tcggacctgc 2400 
ccgccttgtc gactggtctt ctccacctcc accaaaacat cgtggacgta caattcatgt 2460 
atggcctatc acctgccctc acaaaataca tcgtccgatg ggagtgggta atactcttat 2520
tcctgctctt agcggacgcc agggtttgcg cctgcttatg gatgctcatc ttgttgggcc 2580 
aggccgaagc agctttggag aacctcgtaa tactcaatgc agcatccctg gccgggacgc 2640 
acggtcttgt gtccttcctc gtgttcttct gctttgcgtg gtatctgaag ggtaggtggg 2700
tgcccggagc ggtctacgcc ctctacggga tgtggcctct cctcctgctc ctgctggcgt 2760 
tgcctcagcg ggcatatgca ctggacacgg aggtggccgc gtcgtgtggc ggcgttgttc 2820
ttgtcgggtt aatggcgctg actctgtcgc catattacaa gcgctatatc agctggtgca 2880 
tgtggtggct tcagtatttt ctgaccagag tagaagcgca actgcacgtg tgggttcccc 2940
ccctcaacgt ccgggggggg cgcgatgccg tcatcttact catgtgtgta gtacacccga 3000
ccctggtatt tgacatcacc aaactactcc tggccatctt cggacccctt tggattcttc 3060
aagccagttt gcttaaagtc ccctacttcg tgcgcgttca aggccttctc cggatctgcg 3120 
cgctagcgcg gaagatagcc ggaggtcatt acgtgcaaat ggccatcatc aagttagggg 3180 
cgcttactgg cacctatgtg tataaccatc tcacccctct tcgagactgg gcgcacaacg 3240
gcctgcgaga tctggccgtg gctgtggaac cagtcgtctt ctcccgaatg gagaccaagc 3300 
tcatcacgtg gggggcagat accgccgcgt gcggtgacat catcaacggc ttgcccgtct 3360 
ctgcccgtag gggccaggag atactgctCg ggccagccga cggaatggtc tccaaggggt 3420 
ggaggttgct ggcgcccatc acggcgtacg cccagcagac gagaggcctc ctagggtgta 3480 
taatcaccag cctgactggc cgggacaaaa accaagtgga gggtgaggtc cagatcgtgt 3540
caactgctac ccaaaccttc ctggcaacgt gcatcaatgg ggtatgctgg actgtctacc 3600
acggggccgg aacgaggacc atcgcatcac ccaagggtcc tgtcatccag atgtatacca 3660 
atgtggacca agaccttgtg ggctggcccg ctcctcaagg ttcccgctca ttgacaccct 3720 
gtacctgcgg ctcctcggac ctttacctgg tcacgaggca cgccgatgtc attcccgtgc 3780 
gccggcgagg tgatagcagg ggtagcctgc tttcgccccg gcccatttcc tacttgaaag 3840
gctcctcggg gggtccgctg ttgtgccccg cgggacacgc cgtgggccta ttcagggccg 3900 
cggtgtgcac ccgtggagtg gctaaagcgg tggactttat ccctgtggag aacctaggga 3960
caaccatgag atccccggtg ttcacggaca actcctctcc accagcagtg ccccagagct 4020
tccaggtggc ccacctgcat gctcccaccg gcagcggtaa gagcaccaag gtcccggctg 4080
cgtacgcagc ccagggctac aaggtgttgg tgctcaaccc ctctgttgct gcaacgctgg 4140

gctttggtgc ttacatgtcc aaggcccatg gggttgatcc taatatcagg accggggtga 4200

gaacaattac cactggcagc cccatcacgt actccaccta cggcaagttc cttgccgacg 4260 

gcgggtgctc aggaggtgct tatgacataa taatttgtga cgagtgccac tccacggatg 4320 

ccacatccat cttgggcatc ggcactgtcc ttgaccaagc agagactgcg ggggcgagac 4380 

tggttgtgct cgccactgct acccctccgg gctccgtcac tgtgtcccat cctaacatcg 4440 

aggaggttgc tctgtccacc accggagaga tcccctttta cggcaaggct atccccctcg 4500 

aggtgatcaa ggggggaaga catctcatct tctgccactc aaagaagaag tgcgacgagc 4560 

tcgccgcgaa gctggtcgca ttgggcatca atgccgtggc ctactaccgc ggtcttgacg 4620 

tgtctgtcat cccgaccagc ggcgatgttg tcgtcgtgtc gaccgatgct ctcatgactg 4680 

gctttaccgg cgacttcgac tctgtgatag actgcaacac gtgtgtcact cagacagtcg 4740 

atttcagcct tgaccctacc tttaccattg agacaaccac gctcccccag gatgctgtct 4800 

ccaggactca acgccggggc aggactggca gggggaagcc aggcatctat agatttgtgg 4860 

caccggggga gcgcccctcc ggcatgttcg actcgtccgt cctctgtgag tgctatgacg 4920 

cgggctgtgc ttggtatgag ctcacgcccg ccgagactac agttaggcta cgagcgtaca 4980 

tgaacacccc ggggcttccc gtgtgccagg accatcttga attttgggag ggcgtcttta 5040

cgggcctcac tcatatagat gcccactttt tatcccagac aaagcagagt ggggagaact 5100 

ttccttacct ggtagcgtac caagccaccg tgtgcgctag ggctcaagcc cctcccccat 5160

cgtgggacca gatgtggaag tgtttgatcc gccttaaacc caccctccat gggccaacac 5220 

ccctgctata cagactgggc gctgttcaga atgaagtcac cctgacgcac ccaatcacca 5280

aatacatcat gacatgcatg tcggccgacc tggaggtcgt cacgagcacc tgggtgctcg 5340

ttggcggcgt cctggctgct ctggccgcgt attgcctgtc aacaggctgc gtggtcatag 5400 

tgggcaggat cgtcttgtcc gggaagccgg caattatacc tgacagggag gttctctacc 5460

aggagttcga tgagatggaa gagtgctctc agcacttacc gtacatcgag caagggatga 5520

tgctcgctga gcagttcaag cagaaggccc tcggcctcct gcagaccgcg tcccgccatg 5580 

cagaggttat cacccctgct gtccagacca actggcagaa actcgaggtc ttttgggcga 5640 

agcacatgtg gaatttcatc agCgggatac aatacttggc gggcctgtca acgctgcctg 5700

gtaaccccgc cattgcttca ttgatggctt ttacagctgc cgtcaccagc ccactaacca 5760
ctggccaaac cctcctcttc aacatattgg gggggtgggt ggctgcccag ctcgccgccc 5820 

ccggtgccgc tactgccttt gtgggtgctg gcctagctgg cgccgccatc ggcagcgttg 5880 

gactggggaa ggtcctcgtg gacattcttg cagggtatgg cgcgggcgtg gcgggagctc 5940

ttgtagcatt caagatcatg agcggtgagg tcccctccac ggaggacctg gtcaatctgc 6000 

tgcccgccat cctctcgcct ggagcccttg tagtcggtgt ggtctgcgca gcaatactgc 6060

gccggcacgt tggcccgggc gagggggcag tgcaatggat gaaccggcta atagccttcg 6120 

cctcccgggg gaaccatgtt tcccccacgc actacgtgcc ggagagcgat gcagccgccc 6180 

gcgtcactgc catactcagc agcctcactg taacccagct cctgaggcga ctgcatcagt 6240 

ggataagctc ggagtgtacc actccatgct ccggttcctg gctaagggac atctgggact 6300

ggatatgcga ggtgctgagc gactttaaga cctggctgaa agccaagctc atgccacaac 6360

tgcctgggat tccctttgtg tcctgccagc gcgggtatag gggggtctgg cgaggagacg 6420

gcattatgca cactcgctgc cactgtggag ctgagatcac tggacatgtc aaaaacggga 6480

cgatgaggat cgtcggtcct aggacctgca ggaacatgtg gagtgggacg ttccccatta 6540

acgcctacac cacgggcccc tgtactcccc ttcctgcgcc gaactataag ttcgcgctgt 6600 

ggagggtgtc tgcagaggaa tacgtggaga taaggcgggt gggggacttc cactacgtat 6660 

cgggtatgac tactgacaat cttaaatgcc cgtgccagat cccatcgccc gaatttttca 6720 

cagaattgga cggggtgcgc ctacacaggt ttgcgccccc ttgcaagccc ttgctgcggg 6780 

aggaggtatc attcagagta ggactccacg agtacccggt ggggtcgcaa ttaccttgcg 6840

agcccgaacc ggacgtagcc gtgttgacgt ccatgctcac tgatccctcc catataacag 6900 

cagaggcggc cgggagaagg ttggcgagag ggtcaccccc ttctatggcc agctcctcgg 6960
ctagccagct gtccgctcca tctctcaagg caacttgcac cgccaaccat gactcccctg 7020
acgccgagct catagaggct aacctcctgt ggaggcagga gatgggcggc aacatcacca 7080
gggttgagtc agagaacaaa gtggtgattc tggactcctt cgatccgctt gtggcagagg 7140
aggatgagcg ggaggtctcc gtacctgcag aaattctgcg gaagtctcgg agattcgccc 7200
gggccctgcc cgtctgggcg cggccggact acaacccccc gctagtagag acgtggaaaa 7260
agcctgacta cgaaccacct gtggtccatg gctgcccgct accacctcca cggtcccctc 7320
ctgtgcctcc gcctcggaaa aagcgtacgg tggtcctcac cgaatcaacc ctatctactg 7380
ccttggccga gcttgccacc aaaagttttg gcagctcctc aacttccggc attacgggcg 7440
acaatacgac aacatcctct gagcccgccc cttctggctg cccccccgac tccgacgttg 7500
agtcctattc ttccatgccc cccctggagg gggagcctgg ggatccggat ctcagcgacg 7560
ggtcatggtc gacggtcagt agtggggccg acacggaaga tgtcgtgtgc tgctcaatgt 7620
cttattcctg gacaggcgca ctcgtcaccc cgtgcgctgc ggaagaacaa aaactgccca 7680
tcaacgcact gagcaactcg ttgctacgcc atcacaatct ggtgtattcc accacttcac 7740
gcagtgcttg ccaaaggcag aagaaagtca catttgacag actgcaagtt ctggacagcc 7800
attaccagga cgtgctcaag gaggtcaaag cagcggcgtc aaaagtgaag gctaacttgc 7860
tatccgtaga ggaagcttgc agcctgacgc ccccacattc agccaaatcc aagtttggct 7920
atggggcaaa agacgtccgt tgccatgcca gaaaggccgt agcccacatc aactccgtgt 7980
ggaaagacct tctggaagac agtgtaacac caatagacac taccatcatg gccaagaacg 8040
aggttttctg cgttcagcct gagaaggggg gtcgtaagcc agctcgtctc atcgtgttcc 8100
ccgacctggg cgtgcgcgtg tgcgagaaga tggccctgta cgacgtggtt agcaagctcc 8160
ccctggccgt gatgggaagc tcctacggat tccaatactc accaggacag cgggttgaat 8220
tcctcgtgca agcgtggaag tccaagaaga ccccgatggg gttctcgtat gatacccgct 8280
gttttgactc cacagtcact gagagcgaca tccgtacgga ggaggcaatt taccaatgtt 8340
gtgacctgga cccccaagcc cgcgtggcca tcaagtccct cactgagagg ctttatgttg 8400
ggggccctct taccaattca aggggggaaa actgcggcta ccgcaggtgc cgcgcgagcg 8460
gcgtactgac aactagctgt ggtaacaccc tcacttgcta catcaaggcc cgggcagcct 8520
gtcgagccgc agggctccag gactgcacca tgctcgtgtg tggcgacgac ttagtcgtta 8580
tctgtgaaag tgcgggggtc caggaggacg cggcgagcct gagagccttc acggaggcta 8640
tgaccaggta ctccgccccc cccggggacc ccccacaacc agaatacgac ttggagctta 8700
taacatcatg ctcctccaac gtgtcagtcg cccacgacgg cgctggaaag agggtctact 8760
accttacccg tgaccctaca acccccctcg cgagagccgc gtgggagaca gcaagacaca 8820
ctccagtcaa ttcctggcta ggcaacataa tcatgtttgc ccccacactg tgggcgagga 8880
tgatactgat gacccatttc tttagcgtcc tcatagccag ggatcagctt gaacaggctc 8940
ttaactgtga gatctacgga gcctgctact ccacagaacc actggatcta cctccaatca 9000
ttcaaagact ccatggcctc agcgcatttt cactccacag ttactctcca ggtgaaatca 9060
atagggtggc cgcatgcctc agaaaacttg gggtcccgcc cttgcgagct tggagacacc 9120
gggcccggag cgtccgcgct aggcttctgt ccagaggagg cagggctgct atatgtggca 9180
agtacctctt caactgggca gtaagaacaa agctcaaact cactccaata gcggccgctg 9240
gccggctgga cttgtccggt tggttcacgg ctggctacag cgggggagac atttatcaca 9300
gcgtgtctca tgcccggccc cgctggttct ggttttgcct actcctgctc gctgcagggg 9360
taggcatcta cctcctcccc aaccgatgaa ggttggggta aacactccgg cctcttaagc 9420
catttcctgt tttttttttt tttttttttt tttttttctt tttttttttc tttcctttcc 9480
ttcttttttt cctttctttt tcccttcttt aatggtggct ccatcttagc cctagtcacg 9540
gctagctgtg aaaggtccgt gagccgcatg actgcagaga gtgctgatac tggcctctct 9600
gcagatcatg t

<210> 10
<211> 3015
<212> PRT 

<213> Hepatitis C virus
<400> 10 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Asp Arg Arg Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val Gly Ala Pro Leu
130 135 140

Gly Gly Val Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Phe Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Ile Thr Thr Pro Val Ser Ala Ala
180 185 190

Glu Val Lys Asn Ile Ser Thr Gly Tyr Met Val Thr Asn Asp Cys Thr
195 200 205

Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu His Val Pro
210 215 220

Gly Cys Val Pro Cys Glu Lys Val Gly Asn Ala Ser Gln Cys Trp Ile
225 230 235 240

Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro Gly Ala Leu Thr Gln
245 250 255

Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Gly Val Met Leu Ala Ala
275 280 285

Gln Met Phe Ile Val Ser Pro Gln His His Trp Phe Val Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly Thr Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr
325 330 335

Ala Met Arg Val Pro Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His
340 345 350

Trp Gly Val Met Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp
355 360 365

Ala Lys Val Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala Arg
370 375 380

Thr His Thr Val Gly Gly Ser Ala Ala Gln Thr Thr Gly Arg Leu Thr
385 390 395 400

Ser Leu Phe Asp Met Gly Pro Arg Gln Lys Ile Gln Leu Val Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu His Thr Gly Phe Ile Ala Ser Leu Phe Tyr Thr His Ser Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Met Ser Ala Cys Arg Ser Ile Glu Ala
450 455 460

Phe Arg Val Gly Trp Gly Ala Leu Gln Tyr Glu Asp Asn Val Thr Asn
465 470 475 480

Pro Glu Asp Met Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Gln Cys
485 490 495

Gly Val Val Ser Ala Lys Thr Val Cys Gly Pro Val Tyr Cys Phe Thr
500 505 510

Pro Ser Pro Val Val Val Gly Thr Thr Asp Arg Leu Gly Ala Pro Thr
515 520 525

Tyr Thr Trp Gly Glu Asn Glu Thr Asp Val Phe Leu Leu Asn Ser Thr
530 535 540

Arg Pro Pro Leu Gly Ser Trp Phe Gly Cys Thr Trp Met Asn Ser Ser
545 550 555 560

Gly Tyr Thr Lys Thr Cys Gly Ala Pro Pro Cys Arg Thr Arg Ala Asp
565 570 575

Phe Asn Ala Ser Thr Asp Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys
580 585 590

His Pro Asp Thr Thr Tyr Leu Lys Cys Gly Ser Gly Pro Trp Leu Thr
595 600 605

Pro Arg Cys Leu Ile Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys
610 615 620

Thr Val Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val
625 630 635 640

Glu His Arg Leu Thr Ala Ala Cys Asn Phe Thr Arg Gly Asp Arg Cys
645 650 655

Asn Leu Glu Asp Arg Asp Arg Ser Gln Leu Ser Pro Leu Leu His Ser
660 665 670

Thr Thr Glu Trp Ala Ile Leu Pro Cys Ser Tyr Ser Asp Leu Pro Ala
675 680 685

Leu Ser Thr Gly Leu Leu His Leu His Gln Asn Ile Val Asp Val Gln
690 695 700

Phe Met Tyr Gly Leu Ser Pro Ala Leu Thr Lys Tyr Ile Val Arg Trp
705 710 715 720

Glu Trp Val Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
725 730 735

Ala Cys Leu Trp Met Leu Ile Leu Leu Gly Gln Ala Glu Ala Ala Leu
740 745 750

Glu Asn Leu Val Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly
755 760 765

Leu Val Ser Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly
770 775 780

Arg Trp Val Pro Gly Ala Val Tyr Ala Leu Tyr Gly Met Trp Pro Leu
785 790 795 800

Leu Leu Leu Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr
805 810 815

Glu Val Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala
820 825 830

Leu Thr Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp
835 840 845

Trp Leu Gln Tyr Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp
850 855 860

Val Pro Pro Leu Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu
865 870 875 880

Met Cys Val Val His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu
885 890 895

Leu Ala Ile Phe Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys
900 905 910

Val Pro Tyr Phe Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu
915 920 925

Ala Arg Lys Ile Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys
930 935 940

Leu Gly Ala Leu Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu
945 950 955 960

Arg Asp Trp Ala His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu
965 970 975

Pro Val Val Phe Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala
980 985 990

Asp Thr Ala Ala Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala
995 1000 1005

Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser
1010 1015 1020

Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr
1025 1030 1035 1040

Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys
1045 1050 1055

Asn Gln Val Glu Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr
1060 1065 1070

Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly
1075 1080 1085

Ala Gly Thr Arg Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met
1090 1095 1100

Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly
1105 1110 1115 1120

Ser Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu
1125 1130 1135

Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser
1140 1145 1150

Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser
1155 1160 1165

Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe
1170 1175 1180

Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile
1185 1190 1195 1200

Pro Val Glu Asn Leu Gly Thr Thr Met Arg Ser Pro Val Phe Thr Asp
1205 1210 1215

Asn Ser Ser Pro Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu
1220 1225 1230

His Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr
1235 1240 1245

Ala Ala Gln Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala
1250 1255 1260

Thr Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro
1265 1270 1275 1280

Asn Ile Arg Thr Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr
1285 1290 1295

Tyr Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly
1300 1305 1310

Ala Tyr Asp Ile Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr
1315 1320 1325

Ser Ile Leu Gly Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly
1330 1335 1340

Ala Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr
1345 1350 1355 1360

Val Ser His Pro Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu
1365 1370 1375

Ile Pro Phe Tyr Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly
1380 1385 1390

Arg His Leu Ile Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala
1395 1400 1405

Ala Lys Leu Val Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly
1410 1415 1420

Leu Asp Val Ser Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser
1425 1430 1435 1440

Thr Asp Ala Leu Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile
1445 1450 1455

Asp Cys Asn Thr Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro
1460 1465 1470

Thr Phe Thr Ile Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg
1475 1480 1485

Thr Gln Arg Arg Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg
1490 1495 1500

Phe Val Ala Pro Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val
1505 1510 1515 1520



Leu Cys Glu Cys Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro 
1525 1530 1535 

Ala Glu Thr Thr Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu 
1540 1545 1550 

Pro Val Cys Gln Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly
1555 1560 1565

Leu Thr His Ile Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly
1570 1575 1580

Glu Asn Phe Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg
1585 1590 1595 1600

Ala Gln Ala Pro Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile
1605 1610 1615

Arg Leu Lys Pro Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu
1620 1625 1630

Gly Ala Val Gln Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr
1635 1640 1645

Ile Met Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp
1650 1655 1660

Val Leu Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser
1665 1670 1675 1680

Thr Gly Cys Val Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro
1685 1690 1695

Ala Ile Ile Pro Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met
1700 1705 1710

Glu Glu Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu
1715 1720 1725

Ala Glu Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser
1730 1735 1740

Arg His Ala Glu Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys
1745 1750 1755 1760

Leu Glu Val Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile
1765 1770 1775

Gln Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala
1780 1785 1790

Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly
1795 1800 1805

Gln Thr Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu
1810 1815 1820

Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly
1825 1830 1835 1840

Ala Ala Ile Gly Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu
1845 1850 1855

Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile
1860 1865 1870

Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro
1875 1880 1885

Ala Ile Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala
1890 1895 1900

Ile Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met
1905 1910 1915 1920

Asn Arg Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr
1925 1930 1935

His Tyr Val Pro Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu
1940 1945 1950

Ser Ser Leu Thr Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile
1955 1960 1965

Ser Ser Glu Cys Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile
1970 1975 1980

Trp Asp Trp Ile Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys
1985 1990 1995 2000

Ala Lys Leu Met Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln
2005 2010 2015

Arg Gly Tyr Arg Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg
2020 2025 2030

Cys His Cys Gly Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met
2035 2040 2045

Arg Ile Val Gly Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe
2050 2055 2060

Pro Ile Asn Ala Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro
2065 2070 2075 2080

Asn Tyr Lys Phe Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu
2085 2090 2095

Ile Arg Arg Val Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp
2100 2105 2110

Asn Leu Lys Cys Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu
2115 2120 2125

Leu Asp Gly Val Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu
2130 2135 2140

Leu Arg Glu Glu Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val
2145 2150 2155 2160

Gly Ser Gln Leu Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr
2165 2170 2175

Ser Met Leu Thr Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg
2180 2185 2190

Arg Leu Ala Arg Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser
2195 2200 2205

Gln Leu Ser Ala Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp
2210 2215 2220

Ser Pro Asp Ala Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu
2225 2230 2235 2240

Met Gly Gly Asn Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile
2245 2250 2255

Leu Asp Ser Phe Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val
2260 2265 2270

Ser Val Pro Ala Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala
2275 2280 2285

Leu Pro Val Trp Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Qlu Thr 
2290 2295 2300 

Trp Lys Lys Pro Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu 
2305 2310 2315 2320 

Pro Pro Pro Arg Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr 
2325 2330 2335 

Val Val Leu Thr Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala 
2340 2345 2350 

Thr Lys Ser Phe Gly Ser Ser Ser Thr Ser Gly He Thr Gly Asp Asn 
2355 2360 2365 

Thr Thr Thr Ser Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser 
. 2370 2375 2380 

Asp Val Glu Ser Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly 
2385 2390 2395 2400 

Asp Pro Asp Leu Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala 
2405 2410 2415 

Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly 
2420 2423 2430 

Ala Leu Val Thr Pro Cys Ala Ala Qlu Glu Gin Lys Leu Pro He Asn 
2435 2440 2445 

Ala Leu Ser Asn Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr 
2450 2455 2460 

Thr Ser Arg Ser Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg
2465 2470 2475 2480

Leu Gln Val Leu Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys
2485 2490 2495 

Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala 
2500 2505 2510 

Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly 
2515 2520 2525 

Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn
2530 2535 2540

Ser Val Trp Lys Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr
2545 2550 2555 2560

Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly
2565 2570 2575

Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg
2580 2585 2590 

Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu 
2595 2600 2605 

Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg
2610 2615 2620

Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly
2625 2630 2635 2640 

Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp
2645 2650 2655

Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln
2660 2665 2670

Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly
2675 2680 2685 

Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg 
2690 2695 2700 

Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr 
2705 2710 2715 2720 

Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr
2725 2730 2735

Met Leu Val Cys Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly
2740 2745 2750

Val Gln Glu Asp Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr
2755 2760 2765

Arg Tyr Ser Ala Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu
2770 2775 2780

Glu Leu Ile Thr Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly
2785 2790 2795 2800



Ala Gly Lys Arg Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu 
2805 2810 2815 

Ala Arg Ala Ala Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp 
2820 2825 2830 

Leu Gly Asn Ile Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile
2835 2840 2845

Leu Met Thr His Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu
2850 2855 2860

Gln Ala Leu Asn Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro
2865 2870 2875 2880

Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe
2885 2890 2895

Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys
2900 2905 2910 

Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala 
2915 2920 2925 

Arg Ser Val Arg Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile
2930 2935 2940 

Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu 
2945 2950 2955 2960 

Thr Pro Ile Ala Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr
2965 2970 2975

Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg
2980 2985 2990 

Pro Arg Trp Phe Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly 
2995 3000 3005 

Ile Tyr Leu Leu Pro Asn Arg
3010 3015 



<210> 11 
<211> 24 
<212> DNA
<213> Hepatitis C virus
<400> 11 

actggacacg gaggtggccg cgtc 24 



<210> 12 

<211> 24 

<212> DNA

<213> Hepatitis C virus 

<400> 12 

ttgttcttgt cgggttaatg gcgc 24 



<210> 13 
<211> 24 
<212> DNA 

<213> Hepatitis C virus 
<400> 13 

gggtgtacta cacacatgag taag 24 



<210> 14 
<211> 22 
<212> DNA

<213> Hepatitis C virus 
<400> 14 

aagcgcccct aacttgatga tg 22 



<210> 15 

<211> 40 
<212> DNA 

<213> Hepatitis C virus 
<400> 15 

cgtcatcgat acctcagcgg gcatatgcac tggacacgga 40 



<210> 16 
<211> 24
<212> DNA

<213> Hepatitis C virus
<400> 16
24



<210> 17 
<211> 32 
<212> DNA

<213> Hepatitis C virus 
<400> 17 

catgcaccag ctgatatagc gcttgtaata tg 32



<210> 18 
<211> 30 

<212> DNA 

<213> Hepatitis C virus 
<400> 18 

tccgtagagg aagcttgcag cctgacgccc 30 



<210> 19 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 19

cagaggaggc agggctgcta tatgtggcaa gtac 34 

<210> 20 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 20 

gtacttgcca catatagcag ccctgcctcc tctg 34



<210> 21 
<211> 43 
<212> DNA 

<213> Hepatitis C virus 
<400> 21 

cgtctctaga caggaaatgg cttaagaggc cggagtgttt acc 43

<210> 22
<211> 65
<212> DNA 

<213> Hepatitis C virus 
<400> 22 

ttatggatgc tcatcttgtt gggccaggcc gaagcagctt tggagaacct cgtaatactc 60 
aatgc 65



<210> 23 
<211> 32
<212> DNA 

<213> Hepatitis C virus 
<400> 23 

aggatttgtg ctcatggtgc acggtctacg ag 32 



<210> 24 
<211> 50 
<212> DNA 

<213> Hepatitis C virus 
<400> 24 

ttttttttgc ggccgctaat acgactcact atagacccgc ccctaatagg 50

<210> 25 
<211> 31 
<212> DNA 

<213> Hepatitis C virus 
<400> 25 

ccgtgcacca tgagcacaaa tcctaaacct c 31



<210> 26 
<211> 26 
<212> DNA 

<213> Hepatitis C virus 
<400> 26

ggatgtaccc catgaggtcg gcaaag 26 



<210> 27 
<211> 30 
<212> DNA

<213> Hepatitis C virus
<400> 27

gtttgcgcct gcttatggat gctcatcttg 30



<210> 28 
<211> 26 
<212> DNA 

<213> Hepatitis C virus 
<400> 28 

gcgtcataag catatgcctg ttgggg 26 

<210> 29
<211> 23
<212> DNA

<213> Hepatitis C virus
<400> 29

ccctcagcac tggagtacat ctg 23

<210> 30
<211> 39
<212> DNA

<213> Hepatitis C virus
<400> 30

cgtcatgcat acccctaggg cggctctcat tgaagaggg 39

<210> 31
<211> 30
<212> DNA

<213> Hepatitis C virus
<400> 31

cgtcccctct tcaatgagag ccgctctaga 30



<210> 32
<211> 28
<212> DNA

<213> Hepatitis C virus
<400> 32

gcggtgaaga ccaagctcaa actcactc 28 



<210> 33 
<211> 41 
<212> DNA 

<213> Hepatitis C virus
<400> 33 

aatctagaag gcgcgcttcc ggcaatggag tgagtttgag c 41 

<210> 34 
<211> 38
<212> DNA 

<213> Hepatitis C virus 
<400> 34 

cgtctctaga ggataaatcc aggaggcgcg cttccggc 38 



<210> 35 
<211> 27 
<212> DNA

<213> Hepatitis C virus
<400> 35 

tactttttgt aggggtaggc cttttcc 27 



<210> 36 
<211> 34 
<212> DNA 

<213> Hepatitis C virus 
<400> 36 

cgtctctaga gtgtagctaa tgtgtgccgc tcta 34



<210> 37 
<211> 24 
<212> DNA 

<213> Hepatitis C virus 
<400> 37 

ctatggagtg tagctaatgt gtgc 24 

<210> 38
<211> 66 
<212> DNA 

<213> Hepatitis C virus 
<400> 38 

cgtctctaga catgatctgc agagagacca gttacggcac tctctgcagt catgcggctc 60 
acggac 66 



<210> 39 
<211> 41 
<212> DNA 

<213> Hepatitis C virus 
<400> 39 

ctttcacagc tagccgtgac tagggctaag atggagccac c 41 

FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



This International Searching Authority found multiple (groups of) 
inventions in this international application, as follows: 

1. Claims: 1-11, 33, 34, 37 completely and partially claims 12-20, 23, 24,
29-32, 35, 36 and 37



A purified and isolated nucleic acid molecule which encodes 
human hepatitis C virus of genotype 2a, DNA constructs 
comprising said nucleic acid, RNA transcript of said 
construct, cell transfected with said transcript, hepatitis 
C virus polypeptide produced by said cell and whose genome 
comprises said nucleic acid, method for assaying candidate 
antiviral agents for activity against HCV using said
cell containing HCV, antibody to said polypeptide or to said
HCV, method for determining the susceptibility of cells in
vitro to support HCV infection using the cells transfected
with the nucleic acid of claim 1 and compositions comprising
said polypeptide suspended in a pharmaceutically acceptable
diluent or excipient. 



2. Claims: 25 and 26 completely and 12-23, 24, 29-32, 35,
36 and 37 partially



A hepatitis C virus polypeptide produced by a cell 
transfected with a DNA construct comprising a nucleic acid 
molecule which encodes human hepatitis C virus of genotype 
2a which is an NS3 protease and method for assaying 
candidate antiviral agents for activity against HCV
comprising exposing said HCV protease to candidate antiviral
agents, antibody to said polypeptide or HCV and compositions
comprising said polypeptide suspended in a pharmaceutically
acceptable diluent or excipient. 



3. Claims: 12-23, 24, 29-32, 35, 36 and 37 partially 

A hepatitis C virus polypeptide produced by a cell 
transfected with a DNA construct comprising a nucleic acid 
molecule which encodes human hepatitis C virus of genotype 
2a which is an E1 protein, antibody to said polypeptide or
HCV and compositions comprising said polypeptide suspended
in a pharmaceutically acceptable diluent or excipient.



4. Claims: 12-23, 24, 29-32, 35, 36 and 37 partially 

A hepatitis C virus polypeptide produced by a cell 
transfected with a DNA construct comprising a nucleic acid 
molecule which encodes human hepatitis C virus of genotype 
2a which is an E2 protein, antibody to said polypeptide or 
HCV and compositions comprising said polypeptide suspended 
in a pharmaceutically acceptable diluent or excipient.



5. Claims: 12-23, 24, 29-32, 35, 36 and 37 partially

A hepatitis C virus polypeptide produced by a cell 
transfected with a DNA construct comprising a nucleic acid 
molecule which encodes human hepatitis C virus of genotype 
2a which is an NS4 protein, antibody to said polypeptide or 
HCV and compositions comprising said polypeptide suspended 
in a pharmaceutically acceptable diluent or excipient.



6. Claims: 27 and 28 completely



Antiviral agent identified as having antiviral activity for 
HCV by the method of claims 23 and/or 25. 